THE EFFECT OF SHOCK ON RECOGNITION THRESHOLDS¹


MICHAEL M. REECE²
New York University 


AN INCREASING amount of research
concerning the nature of perceptual
processes is currently being done by 
investigators in the fields of social, personality, 
and clinical psychology. 

Several of these investigators (2, 6, 10) 
approach perception from such points of view 
as the relationship of motivation to percep- 
tion, the functional value to the organism of 
perceptual behavior, and the like. A number 
of experiments have resulted in data which are 
interpreted as indicating the importance of 
needs and values in “determining” perception. 

Other investigators decry the lack of ade- 
quate operational definitions in much of this 
kind of perceptual research. They feel that 
the evidence in favor of emotional determi- 
nants of perception is inadequate (12). They 
regard perception from the viewpoint of a 
response system reacting to varying stimulus 
configurations, but they disagree with the 
concepts of the former group concerning the 
kind of variables which may be expected to aid 
in an explanation of the phenomena. These 
workers feel that a great deal of perceptual 
behavior may be more adequately and par- 
simoniously described in terms of established 
principles of “modern association theory”
(e.g., frequency); that perception is always 
inferred from responses of the organism and 
that these responses are learned (12). For 
them, the analysis of perceptual behavior is, 
therefore, part of the general problem of 
learning in terms of S-R relationships. 

Members of both groups agree that the effect 
of psychological variables (e.g., habit strength) 
on perception should be further investigated. 

Several studies have utilized recognition 
as a means of investigating the effect of such 

1 This article is based on portions of a dissertation 
submitted to the Graduate School of Arts and Sciences, 
New York University, in partial fulfillment of the re- 
quirements for the degree of Doctor of Philosophy. 
The author is indebted to Drs. L. W. Crafts, M. 
Deutsch, R. W. Gilbert, L. S. Kogan, and J. Zubin. 
The helpful suggestions and criticisms of Dr. J. McV. 
Hunt in the planning of this study are also gratefully 
acknowledged. 

2 Now at Wayne University.


factors as frequency and value upon percep- 
tion (2, 10, 12). Verbal recognition as a re- 
action to a specific stimulus situation is a 
learned response. As such, its acquisition may 
be assumed to proceed in accordance with the 
general principles of learning, i.e., in the same 
manner as does the acquisition of any response 
in any learning situation. 

Among these principles is the association
of a stimulus situation and the response it 
evokes. In Dollard and Miller’s formulation 
(3), the stimulus acquires both a cue function 
and a drive function. Miller (3) also has 
described the manner in which previously 
neutral stimuli acquire a drive function moti- 
vating escape behavior when these stimuli 
are associated with pain and fear. 

In reward learning theory, the association 
of stimulus and response depends upon rein- 
forcement. For Hull (5) and his followers, 
reinforcement occurs through drive reduction. 
In the terms of Dollard and Miller, “a drive
tends to elicit a considerable variety of re- 
sponses and has a stronger tendency to elicit 
some of them than others” (3, p. 192). Pain 
is included among the primary drives. Mowrer 
has shown how escape from punishing electric 
shock reinforces a response (8). For this study, 
it is assumed that the sudden reduction of a 
drive reinforces the response with which it is 
associated. 

According to reinforcement theory, drive 
impels the subject (S) to respond. In a specific 
situation, stimuli evoke a complex of responses. 
The response which successfully reduces the 
drive is strengthened and subsequently, in a 
similar situation, is more readily evoked in 
the presence of drive. If experimental arrange- 
ments in a learning situation permit a particu- 
lar response (e.g., verbal recognition) to be 
associated with reduction of a drive (e.g., 
pain) for some Ss and prevent such association 
for other Ss, that response would be reinforced 
relatively more in the Ss in whom the drive is 
reduced. 

The present experiment attempts to associ- 
ate pain (induced by shock) with stimuli 
(nonsense syllables) during a learning task.
This task requires the recognition and pronunciation
of the same stimuli which are used
in a perceptual recognition situation. 

In the learning conditions which allow the 
escape from the pain, half of the stimuli 
presentations (i.e., stimulus and response 
syllables simultaneously presented) are associ- 
ated with shock. The shock elicits pain which 
motivates escape behavior. The verbal re- 
sponse (i.e., pronunciation) is followed by the 
cessation of shock. Theoretically the escape 
from the shock reduces the pain and reinforces 
the response. With continued training, it is 
predicted that the latency of this response 
will decrease and readiness to respond will increase.

In the nonescape-shock conditions, half of 
the syllable-pairs are associated with shock 
in the same manner as in the escape-shock 
conditions. The shock elicits pain which 
evokes attempts to escape from the pain. 
However, in these conditions, the shock per- 
sists for the entire duration of the shock- 
syllable presentation. The pronunciation re- 
sponse is not followed by the cessation of 
shock and therefore is not reinforced as it is 
in the escape-shock conditions. No reward is 
provided for recognition and pronunciation. 
There may also be an inhibitory effect upon 
pronunciation owing to the competing and 
interfering responses which may result from 
the strong shock. 

From the foregoing, it follows that the 
latency of the pronunciation response should 
be decreased more in the escape-shock condi- 
tions than in the nonescape-shock conditions 
because this response is reinforced more in 
the former conditions. As recognition is 
inferred from pronunciation, the readiness to 
recognize the stimuli associated with shock 
should be relatively greater in the escape- 
shock conditions than in the nonescape-shock 
conditions. 

In an attempt to investigate the effects of 
differences in the acquired drive function of 
stimuli upon the recognition response, the 
experiment included conditions of varying 
potential predictability. The arrangements 
provided predictable shock conditions in which 
discrimination between shock and nonshock 
syllables was considered possible because the 
same syllables were consistently accompanied 
by shock, and other syllables were free from 
shock. As there would be no shock associated 
with certain syllables, the drive to escape 
would not be related directly to these syllables.
It was felt that the cue for the onset of shock
and the acquired drive to escape from the 
shock would be restricted to the shock syl- 
lables. 

The design also provided unpredictable 
shock conditions in which no discrimination 
between shock and neutral syllables was pos- 
sible because the shock was administered in 
a random manner. It was felt that every syl- 
lable would become a cue for possible shock 
and that the drive to escape from the shock 
would become associated with each syllable. 

The posttraining tachistoscopic condition 
measures the thresholds of all syllables previ- 
ously used in the learning task. The syllables 
retain the drive and the cue functions which 
they acquired during the training procedure. 
Each syllable evokes the associated response 
characteristics (e.g., decreased or increased 
latency) developed during the learning. 

The basic hypothesis for the study is that 
the mean recognition-report threshold of 
nonsense syllables, half of which were previ- 
ously associated with shock in a learning 
situation permitting escape from the shock, 
will be lower than the mean threshold of 
syllables, half of which were previously associ- 
ated with inescapable shock in a similar 
situation. 

METHOD 


Subjects. The Ss were 70 undergraduate students 
divided into five equal groups. All Ss were volunteers 
and were assigned at random to the various conditions. 

A factorial design was used to investigate the factors 
of escapability and predictability of shock. The groups 
were predictable escape-shock, unpredictable escape- 
shock, predictable nonescape-shock, and unpredictable 
nonescape-shock. A nonshock group was added to test 
the effects of frequency and of pronunciation without 
shock. 

The visual acuity of each S was first determined by
means of a Snellen Chart. The sequence of conditions
of the experiment was then as follows: (a) initial
tachistoscopic determination of recognition-report thresholds
of the nonsense syllables, (b) first rating of pain
induced by shock, (c) learning of the paired associates,
(d) second rating of pain, (e) postlearning determination
of recognition-report thresholds.

Learning procedure. A list of 20 monosyllabic, three- 
letter syllables from Glaze’s list (4) of “zero per cent 
association value” was used to form 10 paired associates. 
Two sets of lists were prepared, with all of the syllables 
which appeared on one set as stimulus syllables used 
as response syllables in the other set, and vice versa. 

The lists were presented in fixed rotating order (i.e.,
for S1 of each group the sequence of lists was 123412341234,
for S2 of each group the sequence was 234123412341,
for S3 the sequence was 341234123412, etc.).
Shock syllables were randomly selected and presented
to Ss in a rotated order so that each syllable-pair
appeared as a shock-pair. For the predictable shock
conditions, the same syllable-pairs were consistently
accompanied by shock and the remaining were con- 
sistently free from shock for any given individual. For 
the unpredictable shock conditions, the shock syllables 
were rotated in each trial so that, for each S, every 
pair of syllables appeared as a shock-pair during the 
learning procedure. 
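
The fixed rotating order of lists can be illustrated with a short sketch. It assumes that the rotation continues cyclically for Ss beyond S3 (the text gives only the first three sequences and "etc."); the function name and its parameters are illustrative and are not part of the original procedure.

```python
def list_sequence(subject_index, n_lists=4, n_trials=12):
    """Return the order of lists over the learning trials for one S.

    subject_index is 1-based (S1, S2, ...). Cyclic continuation beyond
    the first n_lists subjects is an assumption of this sketch.
    """
    start = (subject_index - 1) % n_lists
    # Each trial advances to the next list, wrapping back to list 1.
    return [(start + trial) % n_lists + 1 for trial in range(n_trials)]


for s in (1, 2, 3):
    print(f"S{s}:", "".join(str(n) for n in list_sequence(s)))
```

For S1, S2, and S3 the sketch reproduces the three sequences quoted in the text.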

A modified Bennett-Brown memory drum was used 
to present the syllables. The anticipation-prompting 
method was used, which involved (a) the presentation 
of the stimulus syllable alone followed by (b) the
simultaneous presentation of this stimulus syllable and 
its associated response syllable. The Ss were instructed 
to anticipate aloud the response syllable when its as- 
sociated stimulus syllable was presented, regardless 
of whether or not the anticipation was correct. Presen- 
tation time of the syllables was two seconds. Inter- 
syllable-pair interval was two seconds and the inter- 
trial interval was six seconds. Learning was terminated 
at the end of the twelfth trial. 

Shock. The administration and intensity of the
shock were manually controlled by E. A milliammeter in
the shock apparatus provided continuous readings of 
intensity. The electroband was fastened on the right 
arm so that the focal zinc electrode was placed on the 
volar surface of the wrist. Contact was reinforced by 
means of electrode paste. 

The shock was administered simultaneously with 
the presentation of the predetermined shock-syllable 
pairs. For the escape-shock groups the pronunciation 
of the response syllable was followed by cessation of the 
shock. For the nonescape-shock groups, shock persisted 
for the entire duration of exposure of the shock-pair 
on the memory drum. The intensity of shock for each 
S was set at the point at which a rating of strong pain 
was elicited. Ratings also determined the point at which 
pain was first felt and the point at which the pain was 
intolerable. The intensity was gradually increased 
during the learning period so that the intensity of shock
during the last trial was as great as or greater than that
which previously had been rated by S as intolerable.
Postlearning ratings of shock were obtained in order 
to provide a means of estimating the extent of adapta- 
tion to the shock. 

Recognition tests. The nonsense syllables of the 
learning task were tachistoscopically presented in 
various random orders prior to, and after, the learning. 
The ascending method of limits was used to determine 
the recognition thresholds. The first exposure duration 
was 10 msec. The duration of each succeeding exposure 
thereafter was increased by 20 msec. The exposure 
duration at which two successive correct reports were 
given by S was termed the threshold for that syllable. 
The threshold for a syllable was completely determined 
before proceeding to the next syllable. 
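
The ascending method of limits described above can be sketched as a simple loop. Two points are assumptions of the sketch, since the text does not settle them: the threshold is taken as the duration of the second of the two successive correct reports, and a ceiling (max_ms) is added only so that the loop terminates. The observer is simulated by a hypothetical function.

```python
def ascending_threshold(report_is_correct, start_ms=10, step_ms=20, max_ms=1000):
    """Ascending method of limits for one syllable.

    report_is_correct(duration_ms) returns True when S reports the
    syllable correctly at that exposure duration. The exposure starts
    at start_ms and is lengthened by step_ms after every exposure.
    Returns the duration of the second of two successive correct
    reports (an assumption), or None if max_ms is exceeded.
    """
    previous_correct = False
    duration = start_ms
    while duration <= max_ms:
        correct = report_is_correct(duration)
        if correct and previous_correct:
            return duration
        previous_correct = correct
        duration += step_ms
    return None


# Hypothetical observer who reports correctly at 70 msec and longer.
threshold = ascending_threshold(lambda duration_ms: duration_ms >= 70)
print(threshold)
```

With this simulated observer the exposures run 10, 30, 50, 70, 90 msec, and the criterion of two successive correct reports is met at the 90-msec exposure.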

Tachistoscope. An electronically controlled tachisto- 
scope of the projection type was used which permitted 
precise control of exposure durations. Slides (2 in. X 2 
in.) of the syllables were projected by means of a TDC 
slide projector on the back of a Trans-Lux screen, with 
the size of the letters maintained at 3g in. X 3¢ in. A 
prefocused T-10, 200-watt bulb was used in the pro- 
jector. The exposure duration was determined by the 
opening and closing of the tachistoscope shutter which 
interrupted the projected beam of light. The brightness 
of the screen was one foot candle (measured by the
light target held perpendicular to the front of the screen
at a distance of 6 in.). Eye-level illumination at S’s 
position, 10 feet from the screen, was five foot candles. 

During the postexperimental interview, S was asked 
to state his interpretation of the shock. Other questions 
concerning his reactions to the experimental conditions, 
feelings, attitudes, etc. were included. The interview 
was clinically oriented, and particular care was taken 
to allay any fears which may have resulted from the 
experience of shock. 


RESULTS 

Adaptation to Shock 

A necessary condition for this study was 
the experience of pain by the Ss during the 
learning task. An administration of electric 
shock does not automatically assure an 
experience of pain since individuals vary 
greatly in their reactions to shock. Hence it 
was necessary to obtain an initial rating from 
each S of the pain induced by the shock in 
order to assure a roughly equivalent degree 
of pain in all shock Ss. However, repeated 
shocks tend to result in a lessened reaction to 
the shock; adaptation to shock is evidenced. 
The conditions of the present experiment 
necessitate a minimum of such adaptation. 

In order to provide an estimate of adapta- 
tion to the shock, postlearning ratings were 
also obtained and were compared with the 
prelearning ratings. Only three Ss (5.4 per 
cent) gave postlearning ratings of little or no 
pain, yielding a decrease of more than two 
units on the rating scale. Of these Ss, one was 
in the predictable nonescape-shock group and 
two were in the unpredictable nonescape- 
shock group. Of the other Ss (94.6 per cent), 
not one gave a postlearning rating of less 
than moderate pain; one S showed a decrease 
of two scale units, 13 Ss showed a decrease 
of one scale unit, and 32 Ss showed no decrease. 

These data suggest that the other results 
should not have been markedly affected by 
adaptation to the shock. 


Recognition-Report Thresholds 


Means. A comparison of the differences in 
thresholds obtained before and after the learn- 
ing (shock) sessions would be misleading. An 
individual who tends to have “high” thresh- 
olds can have greater absolute differences 
than the individual who exhibits “low” 
thresholds although the reduction between 
prelearning and postlearning thresholds may 
be proportionately the same. The use of 
proportions as scores tends to make such 
differences more comparable.

TABLE 1
MEAN RECOGNITION-REPORT SCORES

GROUP               PRED.     UNPRED.       PRED.         UNPRED.     NONSHOCK
                    ESCAPE    ESCAPE        NONESCAPE     NONESCAPE
Stimulus syllable   .57       [illegible]   .74           .81         [illegible]
Response syllable   .65       .55           [illegible]   .75         [illegible]

TABLE 2
EFFECT OF SHOCK ON RECOGNITION-REPORT THRESHOLDS

SOURCE OF VARIATION     MEAN SQUARE    df    F       p
Shock escapability      105.44         1     7.32    .01
Shock predictability    5.12           1     .36
Interaction, E × P      6.95           1     .48
Indiv. diff.            14.41          82





Table 1 shows the scores for mean recogni- 
tion-report thresholds for the groups expressed 
as ratios of postlearning thresholds to pre- 
learning thresholds. Bartlett’s test for ho- 
mogeneity of variances (11) was applied to 
these data and indicates that these groups 
are comparable in variance. 
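
The reason for scoring thresholds as ratios can be shown with a small worked example. The threshold values below are hypothetical and chosen only for illustration; they are not data from the study.

```python
def threshold_ratio(pre_ms, post_ms):
    """Score used in Table 1: postlearning threshold / prelearning threshold."""
    return post_ms / pre_ms


# Hypothetical Ss: one with "high" thresholds, one with "low" thresholds.
high_pre, high_post = 200.0, 150.0
low_pre, low_post = 80.0, 60.0

# The absolute reductions differ (50 msec versus 20 msec) ...
print(high_pre - high_post, low_pre - low_post)

# ... but the proportional change is the same, so the ratio scores agree.
print(threshold_ratio(high_pre, high_post), threshold_ratio(low_pre, low_post))
```

A ratio below one indicates a postlearning threshold lower than the prelearning threshold, which is the direction observed for all groups.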

It is noted that the ratios of all groups are 
less than one. The escape-shock groups have 
the lowest scores, indicating the greatest 
reduction between prelearning and postlearn- 
ing thresholds. The nonescape-shock groups 
have the highest scores, showing the least 
reduction in thresholds. 

Shock- and neutral-syllable thresholds. It was 
expected that stimulus generalization effects 
would occur to some extent. In order to evalu- 
ate these effects, the recognition thresholds 
of shock and neutral syllables were compared 
by means of the analysis of variance tech- 
nique. This treatment is applicable only to 
the data of the predictable shock groups for 
which the specific shock syllables and neutral 
syllables were always the same. 

The analysis of variance of stimulus- and 
of response-syllable thresholds did not yield 
any significant differences between the thresh- 
olds of shock syllables and neutral syllables 
within the escape-shock or within the non- 
escape-shock conditions. It is possible that 
stimulus generalization prevented any differ- 
ence from appearing in the thresholds between 
the shock and neutral syllables within each
group.

Shock predictability and escapability.* Since
there is no significant difference between the 
shock- and neutral-syllable thresholds in the 
predictable shock groups, the mean thresholds 
for the entire list of syllables can be compared 
for the predictable shock and the unpredict- 
able shock groups. 

The factorial arrangement of the shock 
groups allows the use of the double classifica- 
tion analysis of variance method (7) to evalu- 
ate the factors of predictability (predictable 
and unpredictable shock) and of escapability 
(escape and nonescape shock). 

This technique applied to the stimulus 
syllables yields an F of 1.91 for predictability, 
an F of 1.36 for escapability, and an F of 2.37 
for the interaction of predictability and es- 
capability. None of these F’s is significant. 

Table 2 gives the results for the response 
syllables. It may be seen that the predictability 
factor yields no significant difference. But the 
escapability factor shows a difference which 
is significant (p = .01). The hypothesis that 
there is no difference between these groups 
except that due to chance can be rejected 
with confidence. The nonescape-shock groups 
show higher threshold ratios than the escape- 
shock groups. 
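
The F values of Table 2 can be checked by dividing each mean square by the mean square for individual differences. That this term served as the error term is an assumption of the sketch, although it reproduces the printed F values for escapability and for the interaction.

```python
# Mean squares as printed in Table 2.
ms_error = 14.41  # individual differences, assumed to be the error term
mean_squares = {
    "Shock escapability": 105.44,
    "Shock predictability": 5.12,
    "Interaction, E x P": 6.95,
}

# F is the ratio of each effect's mean square to the error mean square.
f_ratios = {source: ms / ms_error for source, ms in mean_squares.items()}

for source, f in f_ratios.items():
    print(f"{source}: F = {f:.2f}")
```

Only the escapability ratio (7.32) is large; the ratios for predictability and for the interaction are both well below one.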

Ignoring the factor of predictability (shown 
to be insignificant) and regrouping the data 
of the shock groups allow comparisons among 
these three groups: the escape-shock group, 
the nonescape-shock group, the nonshock 
group. A significant t of 2.75 (p < .01) appears
for the difference between the threshold ratios
of the nonescape-shock group and the escape-shock
group, with the former showing the
higher thresholds. A t of 2.03 (p < .05) is
obtained for the difference between the threshold
ratios of the nonescape-shock group and
the nonshock group. But there is a t of only
.42 (not significant) for the difference between
the thresholds in the escape-shock and the 
nonshock conditions although the thresholds 
of the former appear to be lower than those 
of the latter. 


Interpretation of Shock 


Each S who experienced shock during the 
learning task was asked in the interview to 


*The term escapability is cumbersome but is used 
in the interest of clarity. It refers to the ability and 
inability to escape from the shock. It should be clearly 
understood that the shock varied in escapability but 
was unavoidable for all shock Ss.
state his interpretation of the shock. These
data can be classified in the broad categories 
of punishment, emphasis, and interference. 
The category of punishment includes interpre- 
tations of the shock as punishment for error, 
for failure to respond quickly, for failure to re- 
spond at all—and as nonspecific punishment. 
The category of emphasis refers to the stressing 
of certain syllables. The category of inter- 
ference includes interpretations of the shock 
as deliberate attempts on the part of E to 
distract, disrupt, or interfere with S. 

In order to compare the interpretations of 
the escape-shock Ss with the nonescape- 
shock Ss, the data were divided into two groups 
categorized as “punishment” and “other.” 
The latter category combines interpretations 
of emphasis and interference. The results 
were evaluated by the chi-square method. 
Table 3 indicates that the obtained differences 
are highly significant. The Ss in the learning 
conditions which enabled them to escape from 
the shock tended to interpret the shock as a 
kind of punishment. The Ss of the conditions 
which did not permit escape from the shock 
tended to interpret the shock as something 
other than punishment, and the greatest 
proportion of them felt the shock was a 
deliberate interference by E. 


Learning 


The degree of learning of nonsense syllables 
was measured by the number of correct 
anticipations of the response syllables during 
the twelfth (final) trial. Since the shock was 
not specifically related to correct or incorrect 
responses, these results are not considered to 
be related directly to the recognition thresholds.
Nevertheless, it may be stated parenthetically
that the learning task proved to be a
difficult one for all Ss. The nonshock group
achieved a higher degree of learning than the 
escape-shock group, but the difference is not 
significant. The escape-shock group learned 
significantly more than the nonescape-shock 
group. Predictability of shock produced only 
chance differences. The obtained differences 
are attributed to the effects of the escapability 
factor. 

The analysis and discussion of the results
of the learning condition will be considered
in detail in another article. For the present, it
is noted that the number of trials (which were
limited by the intensity and the number of
shocks) yielded a relatively low degree of
learning in this experiment.

TABLE 3
INTERPRETATION OF SHOCK

GROUP              PUNISHMENT    OTHER    χ²       p
Escape-shock       19            9*
                                          15.16    .001
Nonescape-shock    5             23**

* Interference 1; emphasis 8.
** Interference 18; emphasis 5.

The learning which is relevant to the 
thresholds in this experiment is concerned with 
learning the “pattern” of shock (i.e., whether
shock was administered only with certain 
syllables or was given randomly) and with the 
decrease in latency of pronunciation of the 
response syllable. Data concerning the latter 
would provide direct and independent evi- 
dence about the latency of recognition in the 
learning situation. Unfortunately, due to 
practical limitations, E was unable to obtain
such data concerning the latency of pro- 
nunciation. 

However, estimates of learning of the 
shock “pattern” were obtained. No S learned
the “pattern” of shock, i.e., no S was able to 
correctly state that the shock accompanied 
only certain syllables or was randomly ad- 
ministered. These estimates were obtained 
from the interview data. 


Qualitative Observations 


Reaction to shock. The range of reactions to 
the repeated administration of shock was 
great. Although relatively few Ss admitted 
feelings of anxiety, sudden, overt, and profuse 
perspiration was a common occurrence. Be- 
havior, such as mispronunciation of the syl- 
lable, frequent clearing of the throat, shuffling 
of feet, and voice tremor, was also common. 

Another kind of behavior which merits 
comment appeared to be a kind of “dissocia- 
tion” from the shock. For example, one S 
held her right arm (electrode was on the right 
wrist) extended in an awkward position at 
shoulder level during the entire period of 
learning. She made no attempt to lower her 
arm and, when questioned, appeared to be 
unaware of this behavior. 

Another factor which emerged from the
interview data is that every S, without exception,
expressed the belief that the shock
was included for a definite purpose and that 
the purpose could not be harmful to S because 
that would not be socially permissible. The 
reaction to the shock was therefore tempered 
by such an attitude. 

It is also noteworthy that several Ss were 
reminded of past experiences of punishment 
when they were shocked. 

The tachistoscopic situation. A noticeable 
feature of the tachistoscopic situation was 
the tendency on the part of a number of Ss 
to react with feelings of frustration. Other 
reactions were feelings of negativeness and 
hostility toward E, which may have been 
related to the administration of the painful 
shock. These feelings generally were expressed 
freely during the interview. 


DISCUSSION 


Predictability of shock. Administering shock 
consistently with the same syllables or in 
random manner did not result in significantly 
different recognition-threshold ratios among 
the groups. However, Ss’ failure to learn the
“pattern” of shock indicates that the predictability
of shock was not effectively established.
It appears that this experiment did
not provide an adequate test of the
shock-predictability factor.

Escapability from shock. The data from the 
stimulus syllables showed only chance differ- 
ences in thresholds for all groups. The escape- 
shock Ss tended to recognize the response 
syllables relatively more readily in the post- 
learning tachistoscopic situation than the 
nonescape-shock Ss. There appeared to be 
only chance differences between the thresholds 
of the nonshock group and the escape-shock 
group. 

The major factor to which the differences of 
recognition-threshold ratios among the shock 
groups can be attributed is escapability from 
shock. The effect of this factor seems to be 
restricted to the response syllable, and it 
seems to generalize from shock syllables to 
neutral syllables. 

The results can be interpreted in terms of 
reinforcement theory as previously presented, 
or in terms of interference or inhibition due 
to shock. 

The inhibitory quality of strong shock has 
frequently been noted. In this study, the disturbing
characteristic of the strong shock is
attested by the behavior of Ss during the
learning, by the interpretations of the shock, 
and by the comments elicited during the 
interview. Moreover, the escape-shock group 
was not free from the inhibitory effects of 
the shock. Although these Ss could terminate 
the shock after it occurred, the shock was 
unavoidable at the moment it was adminis- 
tered. 

It can be argued that where the conditions 
allow no escape, shock has an inhibitory effect 
which shows both in the amount of learning 
accomplished during the 12 trials and also in 
the change in the recognition thresholds. 

Assuming that the extent of inhibition varies 
directly with the duration of shock, the 
escape-shock group has more interference 
with the pronunciation response than the 
nonshock group. The nonescape-shock group 
has the most interference, and the nonshock 
group the least interference. If the recognition 
thresholds vary directly with the extent of 
inhibition, it follows that the nonshock group 
should have the lowest recognition thresholds, 
the escape-shock group should have higher 
thresholds than the nonshock group, and the 
nonescape-shock group should have the 
highest thresholds. This is not the case. The 
nonshock group did not obtain lower thresholds 
than the escape-shock group. 

In terms of reinforcement, it appears that 
the recognition and pronunciation of the 
response syllables should have been more 
rewarding for the escape-shock group than 
for the nonshock group, for the shorter the 
delay in pronunciation the shorter was the 
duration of the shock. The nonshock group 
did not have this source of reward for recog- 
nition. However, the recognition thresholds 
in the escape-shock group need not be lower
than in the nonshock group, even though 
recognition has been reinforced more in this 
shock group. Although the escape-shock 
group was subject to greater reinforcement, 
it was also subject to more inhibition (due to 
the unavoidable shock) than the nonshock 
group. Thus, any actual differences between 
these groups resulting from reinforcement
would be cancelled. 

In short, if shock has an inhibitory effect, 
escape from it must be reinforcing to some 
degree or the change in recognition thresholds 
for the escape-shock group would be less than 
that for the nonshock group. Since this is not
the case, the results can probably be explained
by the interpretation that escape from shock 
is reinforcing and that this tends to overcome 
the inhibitory effects of shock. If this latter 
interpretation is true, then the general theo- 
retical thesis of the study is supported. 

The results show that only the response 
syllables elicited thresholds which were 
significantly different for the various groups. 
This is not surprising in the light of the low 
degree of learning (i.e., number of correct 
anticipations) achieved by Ss. Although 
both the stimulus and response syllables were 
exposed when shock was administered, the 
instructions and conditions were calculated 
to establish, and apparently resulted in, a set 
to react primarily to the response syllable, 
and the stimulus syllable was largely dis- 
regarded. With a greater degree of learning 
one might expect the effects of shock associated 
with the response syllable to be transferred 
to the stimulus syllable since the latter was a 
cue for the response syllable. 

The failure of the predictable shock groups 
to discriminate between shock and neutral 
response syllables within a given list can be 
attributed to the inadequate number of learning
trials and the intensity and number of
shocks. The number of items involved would 
contribute to the difficulty of such discrimina- 
tion since half were neutral and half were 
shock syllables. 


Relation to perceptual defense. Previous 
studies have invoked the concept of perceptual 
defense as an explanation of the “raised” 
recognition thresholds that were obtained (2, 
6). In this experiment, one group which was 
subjected to an unavoidable noxious situation 
from which there was no escape showed rela- 
tively high recognition thresholds. Can these 
results be interpreted as indicating perceptual 
defense? 

The perceptual defense concept (6) assumes 
that the organism responds with defense 
reactions to anxiety-provoking stimuli, and 
proposes that delayed recognition (inferred 
from the relatively higher thresholds) is one 
form of such defense. This assumption is fully 
acceptable to the theoretical context of this 
study; i.e., responses elicited by the desire 
to escape from the pain may be considered in 
terms of defense of the organism. It must be 
noted, however, that this experiment presented 
all shock Ss with the same noxious stimulus, 
providing some of them with the means of
escaping from the pain, and preventing escape
from the pain by others. If one accepts the 
assumption that noxious stimuli evoke defense 
reactions, it follows that both the relatively 
higher thresholds of the group which was not 
allowed to escape from the pain, and the rela- 
tively lower thresholds of the group which 
could terminate the shock may be considered 
as defense reactions. No valid reason is seen 
for restricting the notion of defense to the 
nonescape Ss. 

In terms of this study, it is not necessary to 
conceive of perceptual defense as active 
repression by the organism. On the contrary, 
relatively high thresholds may be considered 
as resulting from a failure to establish a 
response which can effectively decrease the 
anxiety (or pain) associated with the perceived 
stimulus situation. Conversely, stimuli previ- 
ously associated with such anxiety may yield 
relatively low thresholds if recognition has 
been rewarded by escape from the anxiety. 

Other variables. Although the effects of 
frequency of presentation were relatively well 
controlled (i.e., all Ss received the same num- 
ber of learning trials of material with which 
they had had no previous direct experience), 
there was no control of the frequency of letters 
in the initial and final positions in the syllables. 
No quantitative data were obtained in regard 
to this factor, but observation of Ss left a 
definite impression that it was important in 
determining presolution responses. 

The most obvious example was seen in 
reaction to the syllable, XAT. Very frequently 
Ss, after responding correctly to this syllable 
once, expressed doubt concerning their re- 
sponse and changed it at the next exposure 
of the syllable. It is suggested that such re- 
actions are related to the relatively infrequent 
occurrence in the English language of words 
beginning with X. 

Although the ratings indicated that minimal 
adaptation to shock occurred within the groups 
in this experiment, some adaptation appears 
to have occurred, and it may be attributed to 
the fact that attention was not focused on the 
source of the pain (as shown by Ss’ behavior 
and comments) during the learning trials as it 
was during the ratings. 

Certain personality characteristics tend to 
result in higher recognition thresholds in a 
tachistoscopic situation, regardless of the
effect of other variables such as shock. In the 
present situation, it was very evident that
individuals who expressed doubt concerning
their accuracy of perception and appeared to 
lack confidence in their ability to accomplish 
the task, or who did not like to guess, were the 
ones with the highest recognition thresholds 
(prelearning thresholds as well as postlearn- 
ing). A strong impression was gained that a 
high positive relationship exists between 
these characteristics and high recognition 
thresholds.


SUMMARY


The purpose of this study was to test 
deductions from reinforcement learning theory 
concerning visual recognition thresholds. The 
general hypothesis was that the recognition 
of nonsense syllables previously associated 
with electric shock will occur more readily 
if the experimental conditions enable S to 
escape from the shock than if escape is not 
possible. 

The sample consisted of 70 undergraduate 
college students equally divided into five 
groups. Strong shock was administered to 
S while learning a list of paired nonsense 
syllables by the anticipation-prompting 
method which necessitated the pronunciation 
of all response syllables. Four groups were 
used in a factorial design to investigate the 
factors of shock “predictability” and “escapability.”
Pronunciation of the response
syllable was followed by cessation of shock 
in the escape-shock group. The nonescape- 
shock group endured the shock for the entire 
duration of the syllable’s exposure on the 
memory drum. In the predictable shock group,
the shock consistently accompanied the same 
syllable-pairs and the remaining syllables were 
free from shock. In the unpredictable shock 
group, shock was administered in random 
order. A nonshock group was also included. 
Recognition thresholds of the same syllables 
were tachistoscopically determined prior to, 
and after, the learning. 

The statistical analysis indicates that the 
factor of escapability from shock was a sig- 
nificant factor in the determination of the 
postlearning-recognition thresholds of the 
response syllables. The escape-shock group 
evidenced relatively lower thresholds than 
the nonescape-shock group. No significant 
differences were found between the thresholds 
of the escape-shock and the nonshock groups. 

Although the inhibitory effects of the strong
shock complicate the situation, these results
are interpreted as tending to support the 
proposed hypothesis. 

Relationship of the findings to the concept 
of perceptual defense and other variables is 
discussed. 


CONCLUSIONS 


The following conclusions are drawn from 
the results of this study. 

1. It is suggested that deductions based 
upon the principles of reward learning theory 
can effectively predict differences in visual 
recognition thresholds. 

2. The results suggest that the factor of 
escapability from shock in a noxious situation
is a significant determinant of subsequent 
recognition thresholds of stimuli associated 
with that situation. 

3. The experiment did not provide the 
conditions requisite for the adequate investiga- 
tion of the effect of shock “predictability” 
during a learning task upon subsequent 
recognition thresholds of stimuli associated 
with that situation.


THE LACK OF GENERALITY IN DEFENSE MECHANISMS AS
INDICATED IN AUDITORY PERCEPTION¹


SHABSE H. KURLAND 
Columbia University²


ON THE basis of the many studies of
perceptual behavior which have appeared
in the literature (2), one can
conclude that the “emotionality” attached to 
a stimulus may have a significant effect on the 
recognition threshold at which the name of a 
stimulus object can be correctly reported. 
Thus for some subjects (Ss) the threshold is 
lowered (vigilant behavior) and for others it 
is raised (defensive behavior) as a function of 
the “emotionality” of the stimulus (3). One 
theory suggested to account for this observa- 
tion holds that the direction of change in 
threshold is a function of the perceiver’s adjusting
to the anxiety aroused by the stimulus
objects. It has been further proposed that this 
adjustive behavior can be related significantly 
to “defense” mechanisms characteristic of an 
individual’s behavior in nonperceptual situ- 
ations. From this line of reasoning it has been 
deduced that those Ss who typically use intel- 
lectualizing defenses (obsessive-compulsive) 
and are thereby able to cope freely with emo- 
tional stimuli should show vigilant perceptual 
behavior; and those Ss who use repression 
(hysterical) as the typical defense should con- 
tinue to avoid anxiety-arousing stimuli and 
react with defensive perceptual behavior (7). 
Lazarus, Eriksen, and Fonda, in testing this 
hypothesis, found that such a relationship 
does occur and that patients with intellectualiz- 
ing mechanisms perceive threatening material 
with significantly greater accuracy than those 
with repressing mechanisms (7). 
The logic of their reasoning implies that S 


1 This study is based on a doctoral dissertation com- 
pleted at Columbia University. The writer wishes to 
express his appreciation to Drs. Edward Joseph Shoben, 
Jr., chairman of the dissertation committee, and 
Laurance F. Shaffer, Lincoln E. Moses, and Paul E. 
Eiserer for their invaluable assistance. 

An abstract of this article was read at the annual 
meeting of the American Psychological Association, 
Washington, D. C., September 1952. 

2 Now at Veterans Administration Regional Office,
Baltimore, Maryland.

uses one typical kind of defense mechanism
whenever he is confronted with anxiety re- 
gardless of the nature of the situation or the 
source of the threat. In other words, knowing 
an S’s adaptive response to anxiety in one 
situation, one can predict his response in other 
anxiety-arousing situations. Evidence from 
both the clinical study of patients (10) and
from experiments calls this generalization into 
question. For example, Goldstein, in studying 
the consistency of defense mechanisms in 
normal Ss, found that while there was a small 
group of Ss who showed such a consistency, 
there was a much larger number who used 
different mechanisms in relation to different 
impulses (4). Belmont and Birch, who studied 
learning, recall, and relearning using nonsense 
syllables some of which had been associated 
with shock, found no consistency in Ss’ use 
of repression in the various tasks employed (1). 
It would therefore seem worth while to re- 
examine the hypothesis of a relationship be- 
tween Ss’ perceptual mechanisms when they 
are confronted with anxiety-arousing stimuli 
and the type of mechanism they typically use 
to cope with anxiety in nonperceptual situ- 
ations. 

If the assertion that defense mechanisms 
are general is valid, then one would expect the 
perceptual behavior of patients characterized 
by their therapists as using obsessive-compul- 
sive mechanisms (intellectualization and simi- 
lar defenses) to differ from the perceptual 
behavior of patients who are characterized as 
using hysterical mechanisms (repression and 
avoidance). Specifically, the obsessive-com- 
pulsive group should tend to perceive emo- 
tional words at lower intensities than the 
hysterical group. 

METHOD 

Stimulus words. About 150 words selected from the 
literature on perception and word association were 
given to six judges to be sorted into three categories
(neutral, emotional, and undecided) on the basis of
their conception of how a group of neurotics would

TABLE 1
NEUTRAL AND EMOTIONAL STIMULUS WORDS USED

NEUTRAL STIMULI      EMOTIONAL STIMULI
Robin                Kill*
Stone                Raped*
Paper                Agony
Curtain              Failure
Music                Death*
Broom                Vagina
White                Intercourse
Table                Suicide
Chair                Cheat
Trade                Breast*
Shoe                 Crazy
Rug                  Fight
Dog                  Penis
Grass                Love
Book                 Blood
House                Anger*
Radiator             Sex
Hat
Ocean
Bell
Farm
Shade
Lamp
Pencil

* Words on which there was 83 per cent agreement among
judges. On all other words there was 100 per cent agreement.


TABLE 3
THE MATCHING OF PATIENT AND NORMAL GROUPS*

                                       OBSESSIVE-    COMBINED
                                       COMPULSIVE    PATIENTS
                               4           14           18
                              11            8           19
                              32           26           27
                             115
Median years of education     12           13           12        12


TABLE 2
THE THREE LISTS OF WORDS

Robin            Mourn*           Homosexual
Kill             Radiator         Shoe
Stone            Flunk            Rug
Raped            Hat*             Dog
Agony            Fight            Kiss
Paper            Ocean            Grass
Curtain*         Bell             Worry
Music            Penis            Orgasm
Failure          Farm             Book
Broom*           Love             House
Death*           Shade            Blame*
Vagina           Blood
White            Lamp
Table            Anger
Intercourse      Pencil
Suicide          Sex

* Omitted from calculation of results for all Ss since they were
frequently reported as not heard or incorrectly heard.


react to them. By this operation, 24 neutral and 16 
emotional words on which there was unanimous agree- 
ment and eight emotional words on which five judges 
were in agreement and one was doubtful were obtained. 
The 48 words were then divided into three lists. Each 
consisted of eight neutral and eight emotional words, 
tape recorded at intervals of three seconds. Using this 
master recording, it was possible to replay each list 22 
times, decreasing the attenuation first in 11 one-decibel 


* No significant difference exists between hysterics and obsessive-compulsives
or between combined patient and normal groups
on any variable. The run test was used except on the variable of
sex, where a chi-square test was employed. In spite of the lack
of statistical significance, sex must be regarded as essentially uncontrolled.


steps and then in 11 two-decibel steps. Thus, a record 
was made on which each of the three lists was repeated 
identically 22 times at increasing degrees of loudness, 
starting from a level where the words were not audible 
to a point where they were clearly perceptible. Recog- 
nition thresholds for each word were thereby obtained 
by an ascending method of limits. 
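
The loudness schedule just described can be written out as a short sketch. It rests on one assumption that the text does not settle: each of the 22 playings is taken to be one step louder than the one before it, with the first playing counted as the first one-decibel step. The function name `presentation_levels` is hypothetical.

```python
def presentation_levels(one_db_steps=11, two_db_steps=11):
    """Cumulative reduction in attenuation (in decibels) at each playing.

    Assumption: every playing follows one step of the schedule, so the
    22 playings correspond to 11 one-decibel and 11 two-decibel steps.
    """
    levels = []
    gain = 0
    for step in [1] * one_db_steps + [2] * two_db_steps:
        gain += step
        levels.append(gain)
    return levels


levels = presentation_levels()
# Number of playings, and the gain at the first, eleventh, and last playing.
print(len(levels), levels[0], levels[10], levels[-1])
```

Under this reading the last playing is 33 decibels above the inaudible starting level, and the steps are finer in the lower half of the range.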

Subjects. A patient group consisting of 37 persons 
was selected.³ Fifteen were psychiatrically judged as
using predominantly repressive mechanisms and 22 as 
using obsessive-compulsive mechanisms for handling 
anxiety. The therapist who was seeing the patient for 
intensive therapy three times weekly made the decision 
whether or not the patient satisfied the criterion. A 
nonpatient group of 21 individuals who had never had 
any psychiatric or psychological treatment and were 
apparently functioning satisfactorily was also tested. 
As indicated in Table 3, there were no significant dif- 
ferences between patient and normal groups with re- 
spect to age, sex, intelligence, or education. 

Procedure. The patients reported to the psychology 
office of the hospital and were asked to participate in a 
psychological test which had been recorded on a tape. 
They were asked to repeat just what they heard. The 
normal Ss were told that the examiner (E) was working
on a research project in clinical psychology. After
the introduction, the procedure was identical for all 
Ss, and E’s instructions to report what was heard and 
to guess if the sounds were not clearly perceptible were 
repeated on the tape of stimulus words. The patient 
listened to the recording with a pair of earphones. Be- 
tween the first and second and the second and third 
lists, a short rest period was given. Using a stopwatch 
and a prepared time schedule, E always knew to what 
word S was responding or failing to respond. All re- 
sponses were recorded. 





3 The patients were all hospitalized in Hillside Hos- 
pital, Glen Oaks, Long Island, New York. The author 
wishes to acknowledge his debt of gratitude to Drs. 
Joseph S. A. Miller, Medical Director of the hospital 
and Milton S. Gurvitz, Chief Psychologist, and to the 
other staff members who cooperated in this project.

At the conclusion of the auditory experiment, the
Hillside Short Form of the Wechsler-Bellevue Intel- 
ligence Scale was administered (5). Other relevant data 
such as age and education were also obtained at this 
time. 

RESULTS 

The criterion threshold values used in com- 
putation are based on the decibel level at which 
S made a correct response to the stimulus 
word twice successively, the lower of the two 
values being selected.⁴ For each S a perceptual
score was then computed by subtracting the
median threshold value for S’s neutral words
from the median threshold value for his emo- 
tional words. To control for the variations in 
range of recognition thresholds among Ss, 
this difference in medians was divided by the 
average of the interquartile ranges for the 
neutral and emotional word lists. The per- 
ceptual score, because of this use of the dif- 
ference in medians for critical and neutral 
lists, does not reflect individual differences in 
auditory acuity, attention, or similar factors 
which may raise or lower the threshold in- 
dependently of the nature of the stimulus word. 
Since the perceptual scores of one group of Ss 
are compared with those of a different group, 
it was not necessary to equate the two word 
lists in terms of word length, word frequency, 
or phonetic structure. These factors are con- 
stants for both groups and cannot, therefore, 
contribute to any difference in perceptual 
scores which may exist for the two groups. 
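
The scoring rule in this paragraph can be illustrated with a minimal sketch. The function names and the sample thresholds are hypothetical, and the quartile convention is an assumption, since the text does not state how the interquartile ranges were computed.

```python
from statistics import median, quantiles


def criterion_threshold(levels, correct):
    """Lower of the two levels at the first pair of successive correct
    reports, or None if S is never correct twice in succession."""
    for i in range(len(correct) - 1):
        if correct[i] and correct[i + 1]:
            return min(levels[i], levels[i + 1])
    return None


def interquartile_range(values):
    # Assumption: "inclusive" quartiles; the article does not specify
    # which convention was used.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def perceptual_score(neutral, emotional):
    """(median emotional - median neutral) / mean of the two IQRs."""
    spread = (interquartile_range(neutral) + interquartile_range(emotional)) / 2
    return (median(emotional) - median(neutral)) / spread


# Hypothetical thresholds (arbitrary decibel units) for one S.
neutral = [10, 12, 14, 16, 18]
emotional = [8, 10, 12, 14, 16]
print(perceptual_score(neutral, emotional))  # negative: emotional words reported sooner
```

A negative score therefore corresponds to what the text calls vigilant behavior and a positive score to defensive behavior, while dividing by the spread puts Ss with wide and narrow threshold ranges on a common scale.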

It was hypothesized that if there is gener- 
ality of application of defense mechanisms, 
then the perceptual scores of the obsessive- 
compulsive group would be lower than those 
of the hysterical group. Using the Mann-Whitney
U test for data not normally distributed (8),⁵
the null hypothesis of no difference
in perceptual scores for the two patient
groups cannot be rejected. Also, there are no
significant relationships between perceptual
scores and sex (run test⁸), age, intelligence, or
education (rank-order correlation, Table 4).
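
For readers unfamiliar with the rank test named above, the following is a minimal sketch of the U statistic, not the author's computation. The helper names and the sample scores are hypothetical; tied scores are given average ranks.

```python
def midranks(values):
    """1-based ranks of the values, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y). U_x counts the pairs in which an x score
    exceeds a y score, with tied pairs counted as one half."""
    ranks = midranks(list(x) + list(y))
    n1, n2 = len(x), len(y)
    u_x = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    return u_x, n1 * n2 - u_x


# Hypothetical perceptual scores for two small groups.
obsessive_compulsive = [-0.30, -0.10, 0.05, 0.20]
hysterical = [-0.45, -0.27, -0.05, 0.10]
print(mann_whitney_u(obsessive_compulsive, hysterical))
```

If the first group's scores were consistently lower, as hypothesized, U_x would fall near zero; values near half the number of pairs give no ground for rejecting the null hypothesis.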


4 A table of threshold values for each S has been
deposited with the American Documentation Institute. 
Order Document No. 3982 from the ADI Auxiliary 
Publications Project, Photoduplication Service, Library 
of Congress, Washington, D. C., remitting in advance 
$1.25 for photoprints or $1.25 for 35 mm. microfilm. 
Make checks payable to Chief, Photoduplication 
Service, Library of Congress. 

5 The Mann-Whitney U test is designed to test
whether one population has a larger mean than another.
It is a nonparametric test based on ranks.

These negative results suggest a lack of relationship
both between obsessive-compulsive
mechanisms and vigilant perceptual behavior 
in the auditory mode and between hysterical 
defenses and repressive perceptual responses, 
at least to auditory stimuli. The question 
remains, however, whether or not patients 
differ in their perceptual behavior from Ss who 
are not psychiatrically deviated. If the patients 
are characterized by a greater degree of anxiety 
than the “normals,” and if vigilance and de- 
fense are common forms of coping with anx- 
iety, then the patients should show a greater 
incidence of these forms of perceptual response 
than the nonpsychiatric Ss. 

To attack this problem, the two patient 
groups, since they did not differ in their per- 
ceptual scores, were combined and compared 
with the normal group. Neither the Moses 
test (9)⁷ nor the run test (8)⁸ demonstrates
that the patients react vigilantly or defensively 
with greater magnitude than the normals. 
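
The logic of the run test can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the function names and data are hypothetical, scores are assumed to be untied across groups, and the mean and variance are the standard large-sample values for the number of runs under the null hypothesis.

```python
def count_runs(x, y):
    """Number of runs of group labels when both samples are pooled
    and ordered by score (assumes no ties between the groups)."""
    pooled = sorted([(v, "x") for v in x] + [(v, "y") for v in y])
    labels = [group for _, group in pooled]
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def expected_runs(m, n):
    """Mean number of runs when both samples come from one population."""
    return 2 * m * n / (m + n) + 1


def variance_of_runs(m, n):
    """Variance of the number of runs under the same null hypothesis."""
    return 2 * m * n * (2 * m * n - m - n) / ((m + n) ** 2 * (m + n - 1))


# Hypothetical scores: complete separation gives the minimum of two runs.
patients = [-0.4, -0.3, -0.2]
normals = [0.1, 0.2, 0.3]
print(count_runs(patients, normals), expected_runs(3, 3), variance_of_runs(3, 3))
```

Too few runs relative to the expected number suggests that the two groups differ in some respect, whether in location, dispersion, or shape, which is why the test has little power against any one alternative.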

The perceptual scores, however, are sig- 
nificantly lower (p < .01) for the patients 
than for the normals, thus indicating that the 
hysterical as well as the obsessive-compulsive 
patients were more vigilant in their auditory 
perceptual reaction to the emotional words 
than were the normal Ss. 

A comparison of the recognition thresholds 
for each of the emotional words for the patient 
and normal groups was also made. First, indi- 


6 The actual medians of the perceptual scores for
the various groups were as follows: obsessive-com- 
pulsive patients, —.08; hysterical patients, —.27; 
normal Ss, .00. Thus, the obtained median of the 
threshold intensities for the hysterical group is lower 
than that for the obsessive-compulsive group, a dif- 
ference in the direction opposite to that which was 
hypothesized. By sign test logic, therefore, the results 
obtained in this experiment cannot be accumulated with 
those of Lazarus, Eriksen, and Fonda (7) to support 
the idea of generality in defense mechanisms; they are 
actually opposed to the hypothesis tested. 

7 The Two-Sample Test or Moses test is designed to 
test the null hypothesis that two groups come from a 
common population against the alternative that the 
scores in one are congested at both the upper and 
lower extremes of the distribution of scores. The test 
is nonparametric and can reject the null hypothesis 
even if the means of the two groups do not differ. 

8 The run test is used to determine if two groups are 
drawn from a common population and can reject the 
null hypothesis no matter how the populations differ. 
Since it is sensitive to all kinds of differences (disper- 
sion, median, skewness, etc.), it is not powerful against 
any single alternative.

TABLE 4
RANK-ORDER CORRELATIONS FOR PERCEPTUAL SCORES
WITH AGE, INTELLIGENCE, AND EDUCATION

OBSESSIVE-COMPULSIVES        NORMALS

-.04
-.04
.34

.35
.15
.21


vidual variation in acuity among Ss was con- 
trolled by expressing the thresholds of each 
S for each emotional word as a deviation from 
his own median threshold on the neutral list. 
The corrected values were then compared for 
the two groups for each word (Mann-Whitney 
U test). Seven of the 21 emotional words 
showed a significant difference (p < .02) in 
thresholds between the combined patient 
group and the normal group (agony, suicide, 
worry, homosexual, love, orgasm, and vagina). 
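
The acuity correction used for the word-by-word comparison can be sketched in a few lines. The function name and the thresholds are hypothetical; only the rule itself (each emotional-word threshold expressed as a deviation from the same S's median neutral threshold) comes from the text.

```python
from statistics import median


def acuity_corrected(emotional, neutral):
    """Express each emotional-word threshold as a deviation from the
    S's own median threshold on the neutral words."""
    baseline = median(neutral.values())
    return {word: level - baseline for word, level in emotional.items()}


# Hypothetical thresholds (arbitrary decibel units) for one S.
neutral = {"robin": 10, "stone": 12, "paper": 14}
emotional = {"agony": 14, "worry": 10}
print(acuity_corrected(emotional, neutral))
```

The corrected values for a given word can then be pooled within each group and compared between groups, one word at a time.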

The effect of age, education, and intelligence 
upon the recognition thresholds for the normal 
Ss was studied, and no significant relationships
were found. Normal female Ss, however, 
showed both vigilant and defensive perceptual 
behavior of greater magnitude than normal 
male Ss. 


DISCUSSION 


From these results, there is no support for 
the hypothesis that membership in obsessive-compulsive
or hysterical categories is related to
vigilant and defensive modes of perception in 
the auditory mode. That such a phenomenon 
as vigilant perception does occur, however, 
is indicated by the fact that the median thresh- 
olds of our Ss are significantly higher in the 
normal group than in the patient group. Since 
these two groups are equated for sex, intel- 
ligence, education, and age, and since the 
statistical procedures controlled for variations 
in acuity, the obtained difference may be at- 
tributed to the complex of factors involved 
in an emotional disturbance severe enough to 
require hospitalization. This result was not 
predicted, however, so this suggested rationale 
amounts to only an untested hypothesis. 

Although the data give no support to the 
hypothesis of generality in the application of 
defense mechanisms, they may be analyzed 
in another fashion to eliminate the possibility
that a more limited type of relationship between
adjustive mechanisms in perceptual
and nonperceptual situations may obtain. 
The stimuli in the emotional word list repre- 
sent a variety of areas of anxiety. It may be 
that S uses a generally characteristic response 
to anxiety dealing with words representing 
one particular area of anxiety, whereas with 
words from other areas, he may use other types 
of perceptual mechanisms. 

To explore this possibility, those emotional 
words which could be so classified were divided 
into three specific areas: sex, aggression, and 
social failure.⁹ The median recognition thresholds,
after correction for acuity, were calculated
for each patient in the two groups for
each of the three areas. Again, the two groups 
do not differ significantly in their recognition 
thresholds for any of the three groups of
words, indicating that in this experiment there 
was no relationship between membership in 
the obsessive-compulsive or hysterical group 
and the auditory perceptual behavior studied. 
Thus, the present study supports the notion 
of specificity rather than generality of de- 
fense mechanisms. 

One possible rationale to explain the finding 
that the combined patient group was more 
rapid (“vigilant”) in its perceptual reaction 
to the emotional words is based on word fre- 
quency. It is known that there is a significant 
relationship between recognition thresholds 
and the frequency or popularity of a stimulus— 
the higher the frequency, the lower the thresh- 
old (6). Since the patient groups were being 
seen for intensive psychotherapy and were 
living in a hospital environment, the frequency 
of emotional stimuli may have been artificially 
raised. No such factor was operating in the 
nonpatient group. This greater exposure to 
emotional words could result in a lowering of 
the thresholds for the patients and thereby 
account for the finding of vigilant behavior 
in this group. 

An alternative explanation is that patients, 
without regard to diagnostic category, typically 
show vigilant behavior as an adjustive response 
to the anxiety that may be associated with 
auditory stimuli. Whether this explanation is 
more reasonable than one based on frequency 


9 The words in each area were as follows: sex—
vagina, intercourse, breast, homosexual, kiss, orgasm, 
penis, love, and sex; aggression—kill, fight, blood, and 
anger; social failure—failure and flunk.

cannot be determined without additional
study. 

In view of the conflicting results in the 
literature on perceptual defense, it might be 
worthwhile to attempt to resolve the problem 
of the relationship between defense mecha- 
nisms in perceptual and nonperceptual situ- 
ations by using another approach. It would 
seem that a basic requirement in continuing 
this line of investigation is to determine if 
vigilant or defensive perceptual behavior is 
characteristic of certain individuals. If it is 
possible to form groups of vigilant and de- 
fensive perceivers, then one can investigate the 
personality correlates of such behavior and its 
situational and historical antecedents. From 
the obtained results it would be possible to for- 
mulate hypotheses to account for the interre- 
lationships of perception and other variables on 
a sounder basis than has been advanced so far. 


SUMMARY 


Recent studies of perceptual behavior have 
demonstrated that there is a change in recog- 
nition threshold as a function of the “emo- 
tionality” attached to a stimulus. It has been 
proposed that the change in threshold is a 


function of the type of mechanisms used to 
cope with anxiety generally. Since there is 
reason to doubt the generality of this asser- 
tion, this study was undertaken to test the 
hypothesis that those patients who use intel- 
lectualization as the preferred type of mecha- 
nism to handle anxiety will perceive emotional 
words at lower thresholds than those patients 
who use predominantly repressive mecha- 
nisms. 

Twenty-two hospitalized patients using 
obsessive-compulsive mechanisms, 15 patients 
using hysterical mechanisms, and 21 normal 
Ss were presented with neutral and emotional 
words which had been recorded on a magnetic
tape with increasing degrees of loudness so
that the recognition thresholds could be ob- 
tained using an ascending method of limits. 

There was no difference in the perceptual 
recognition thresholds for the two groups of 
patients. The combined patient groups, how- 
ever, perceived the emotional words at sig- 
nificantly lower thresholds than the normal 
Ss. It is concluded that further study is neces- 
sary before one can accept with confidence 
the assertion that an S will respond to anxiety 
in a perceptual situation with the same type 
of defense that he employs in a nonperceptual 
situation.


THREAT-EXPECTANCY, WORD FREQUENCIES, AND PERCEPTUAL
PRERECOGNITION HYPOTHESES¹


EMORY L. COWEN² AND ERNST G. BEIER


University of Rochester 


CONSIDERABLE research interest in recent years
has centered around the study of the influence of
threat on perception. Though patent conclusions are by no
means available, the fruitfulness of early 
work is amply demonstrated in the burgeoning 
of research and theoretical contributions in 
this area by clinically as well as experimentally 
oriented psychologists. Behind this state of 
affairs lies a growing recognition of the need 
for a unified psychology (7), a need which has 
found expression more frequently and more 
specifically in terms of the desirability of a 
greater rapprochement between experimentally 
oriented and clinically oriented psychologists. 
The study of perception, as Werner (19) and 
Howie (6) have pointed out, provides an ideal 
common meeting ground for these heretofore 
largely “isolated disciplines.” When, in addi- 
tion to this, we consider an important trend 
within the field of clinical psychology toward 
fuller utilization of traditional scientific 
methodology in testing clinical hypotheses 
(14), the recent widespread interest in the 
study of the motivational aspects of perception 
seems readily understandable. 

Investigations to date, by Shafer and 
Murphy (16), Postman and Bruner (13), 
Eriksen (3), Rosenstock (15), and others, have 
led to a tentative generalization that threat 
tends to disrupt both the accuracy and speed 
of perceptual report. These studies have, 
individually and collectively, evoked stimulat- 
ing criticism (6, 9, 12) with respect to ex- 
planatory theory as well as methodology. 

1 This investigation was supported in part by a re- 
search grant to the senior author from the National 
Institute of Mental Health of the National Institutes 
of Health, Public Health Service. The specific study 
reported here is the second in a series of papers de- 
riving from a broader research program in the area of 
sociopsychological and personality correlates of psycho- 
logical rigidity. 

2 Portions of this paper were presented to the Division
of Personality and Social Psychology of the
American Psychological Association at the meeting in
Washington, D. C., September 1952.

Even more controversial, perhaps, has been
McGinnies’ attempt to demonstrate the 
existence of autonomic discrimination of threat 
before perceptual report in terms of certain 
objective and qualitative indices of “defense” 
(11). McGinnies’ study has been singled out 
for criticism at both the experimental and the 
theoretical level by Howes and Solomon (5) 
and Solomon and Howes (17). The objections 
which have been advanced may be summarized 
primarily on two major grounds: first, the 
possible intervention of a conscious inhibition 
between initial perception and report of per- 
ception, and second, the fact that the results 
may perhaps be accounted for more parsi- 
moniously by a word-frequency hypothesis 
(i.e., the selected neutral words generally are 
more common ones; hence, seeing them more 
rapidly is interpreted as a function of the 
subject’s [S’s] greater past experience with 
them). 

Recently, several research inquiries have 
been directed at the criticisms raised by 
Howes and Solomon (5, 17). McCleary and 
Lazarus (10) and Lazarus and McCleary (8),
using a conditioning technique with nonsense 
syllables in order to control for the factors of 
differential word frequencies and inhibition 
of report of threat words, found a significantly 
higher prerecognition GSR deflection for 
shocked than for nonshocked syllables. From 
these data, which are generally consistent 
with the earlier McGinnies findings, they 
conclude that the autonomic responses indicate 
that discrimination is occurring before con- 
scious recognition, a process which they refer 
to as “subception.”

Cowen and Beier (2) and Beier and Cowen 
(1) have used a threat-expectancy situation 
(i.e., alerting Ss beforehand to the imminent 
exposure to threat words) in order to reduce 
the likelihood of inhibition of report. Under 
these conditions they have demonstrated that 
Ss tend to require a greater number of trials 
and more time for correct report of threat 
words than neutral ones.

The specific purpose of the present investigation
is to determine whether delayed perceptual
report of threatening words will occur under 
conditions of threat-expectancy in an experi- 
mental setup which allows us to examine more 
precisely the operation of possible conscious 
inhibition as well as differential word fre- 
quencies. 


METHOD


Fifty-nine college undergraduates, 29 male and 30 
female, served as Ss in the present investigation. The 
Ss were tested individually by one of three examiners 
(Es) (two male and one female). The E began the test- 
ing session by reading a list of 71 five-letter words to 
S. Twenty-three of these words were prejudged to be 
more or less threatening in our culture, while the re- 
maining 48 were thought to be essentially neutral. 
The S was informed that he would be asked to decipher 
some of these words later. 

The Ss were then shown a series of booklets, con- 
sisting of 30 carbon copies of a single typed word (in 
capital letters). The 30 copies, which had been typed 
on an electric typewriter, were arranged in order from 
the most blurred to the clearest copy. This technique 
is described more fully elsewhere (2, 20). After two 
practice trials using the words PAPER and NIGHT, 
Ss were asked to decipher eight neutral and eight threat 
words chosen from the initial reading list and presented 
one by one in a randomly selected but fixed order, as 
follows: MAGIC, DIRTY,* TULIP, WHORE,* 
TABLE, PENIS,* CANDY, RAPED,* SUGAR, 
CHEAT,* BREAD, DEATH,* URINE,* PLANT, 
STORE, BITCH.* (Asterisks indicate threat words.) 

In the administration of the test series of words, 
E did not permit Ss to spend more than three seconds 
on any given copy of any of the words. The E recorded 
the number of trials and total time required for correct 
report of each word. Guessing was encouraged and 
all guesses which S made prior to accurate report of the 
stimulus word were recorded. 


RESULTS AND DISCUSSION 


Perception of threat and neutral words. The 
first datum to be examined is the total number 
of trials required for accurate report of threat 
words as compared to neutral ones. For each 
S a difference score was computed based on 
the mean difference in number of trials required 
for accurate report of the two types of words. 
A positive difference indicates that more trials 
were required for the eight threat words than 
the eight neutral ones; a negative difference 
means the opposite. Fifty-three of the 59 Ss 
were found to have positive difference scores. 
For the group as a whole, the obtained mean 
difference score was +4.049 trials. In examin- 
ing the departure of this value from a hypo- 
thetical mean difference score of zero (which 
we would predict if our findings were to be 
explained solely on the basis of chance factors), 
the computed t ratio is 9.38, which is significant 
beyond the .001 level. An analysis of our 
data in terms of time scores indicates essen- 
tially similar results. Since in an earlier study 
(2), a high correlation was found between 
trials and time, we cannot regard the latter 
measure as an independent one. 
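
The computation described above can be sketched in Python (standard library only). The sketch is illustrative and is not a record of the original analysis: the function names and the demonstration difference scores are hypothetical, since individual scores are not reported here. It assumes that the reported t ratio is a one-sample t on the 59 difference scores. It also adds an exact sign test on the 53 positive scores as a distribution-free check that was not part of the original analysis; treating the six remaining scores as negative is a further assumption.

```python
import math
from statistics import mean, stdev


def one_sample_t(diffs, mu0=0.0):
    """t ratio for the mean of per-S difference scores against mu0."""
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # sample SD (n - 1)
    return (mean(diffs) - mu0) / standard_error


def implied_sd(mean_diff, t_ratio, n):
    """SD of the difference scores implied by a reported mean, t, and n."""
    return mean_diff * math.sqrt(n) / t_ratio


def sign_test_two_sided(n_positive, n):
    """Exact two-sided sign test against a 50:50 split of signs."""
    k = max(n_positive, n - n_positive)
    tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical difference scores (threat minus neutral), for illustration only.
demo_diffs = [5.0, 3.5, -1.0, 6.0, 4.5, 2.0]
print("demo t ratio:", round(one_sample_t(demo_diffs), 2))

# Values reported in the text: mean difference +4.049, t = 9.38, N = 59.
print("implied SD:", round(implied_sd(4.049, 9.38, 59), 2))
print("sign test p < .001:", sign_test_two_sided(53, 59) < 0.001)
```

Under these assumptions the reported mean and t ratio imply a standard deviation of a little over three trials for the difference scores, and the sign test agrees with the t ratio in rejecting a chance explanation.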

Conscious inhibition. Several checks were 
made to study the possible influence of con- 
scious inhibition on the above findings. Post- 
test inquiry with this relatively sophisticated 
group revealed few admitted instances of 
deliberate inhibition. In addition to this, 
however, two statistical checks, based on an 
antecedent logical analysis, were utilized. It 
was reasoned that since the threatening words, 
to a considerable extent, were sexually oriented, 
if conscious inhibition were operating one might 
expect female Ss to show it more than male Ss. 
A t ratio based on mean difference between 
threat and neutral words for male and female 
Ss indicated no differences between the groups. 
As a matter of fact, male Ss actually showed 
a somewhat larger mean difference between 
threat and neutral words, but this difference 
did not approach statistical significance. 

Another way of approaching this same prob- 
lem derives from the assumption that Ss’ sex 
per se is not the crucial factor in conscious 
inhibition. Instead, it can be argued that the 
S-E constellation is a more basic determinant 
of whether or not conscious inhibition of 
response will occur. From this line of reasoning, 
one would hypothesize that Ss who were 
examined by Es of the opposite sex would show 
a greater discrepancy in perceptual response 
between threat and neutral words than Ss 
examined by Es of the same sex. To study this 
variable, the pooled records of male Ss with 
female Es and female Ss with male Es were 
compared to the pooled records of male Ss 
with male Es and female Ss with female Es. 
Once again the obtained t ratios indicated no 
significant differences between the groups, nor 
were any major trends toward significance 
observable. 

Word frequencies. A second major criticism 
which has been directed toward this type of 
study is that the “threat” words have com- 
monly been of lower word frequency than 
control words, and for this reason, slower 
response could be explained on the basis of 
the S’s being less familiar with them. Part of 
the logic of a threat-expectancy situation is 
that by reading a list of words to Ss before the 
basic perceptual experiment, actual discrepan- 
cies in word frequencies will tend to be, at 
least situationally, reduced. This of course is 
just an assumption—one which the present 
Es thought desirable to check. 

In order to do this, frequencies were obtained 
from the Thorndike-Lorge word-frequency list 
(18) for each of the 16 test words used. Two 
of the threat words which did not appear in 
the list were assigned arbitrary word fre- 
quencies of one. Next, the average number of 
trials required for each word by the group as 
a whole was determined. The range here was 
from a mean of 7.7 trials for BREAD to a 
mean of 22.5 trials for WHORE. If word 
frequency is a crucial determinant of per- 
ceptual response under the present experi- 
mental conditions, one would hypothesize a 
significant negative correlation between it and 
average number of trials required for correct 
report of the test words. The obtained Pearson 
r for this relationship was +.003. Since the 
suggestion has been advanced (5) that a truer 
picture of the relationship may be obtained 
by using the logarithm of the word frequency, 
this was also done. The resulting nonsignificant 
correlation coefficient was -.08. 
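
The two correlations described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, base-10 logarithms are assumed (the base does not affect r), and the demonstration frequencies and trial means are invented, because the per-word values are not tabled in this report. Words absent from the frequency list are given a frequency of one, as in the text, so that the logarithm is defined.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def log_frequency_r(frequencies, mean_trials):
    """Correlation of log10(word frequency) with mean trials per word."""
    return pearson_r([math.log10(f) for f in frequencies], mean_trials)


# Hypothetical word frequencies and mean trials, for illustration only.
demo_frequencies = [1, 1, 40, 120, 350, 900]
demo_mean_trials = [15.0, 9.5, 12.0, 18.0, 8.0, 14.5]

print("r with raw frequency:", round(pearson_r(demo_frequencies, demo_mean_trials), 3))
print("r with log frequency:", round(log_frequency_r(demo_frequencies, demo_mean_trials), 3))
```

A word-frequency account would predict a clearly negative value from both calls; values near zero, such as those reported above, do not support it.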

Prerecognition hypotheses. Several investi- 
gators have presented data, either objective 
or qualitative, bearing on the prerecognition 
hypotheses of their Ss (3, 11, 13). Such data 
are seemingly useful in attempting to infer 
differential reactions of Ss to threatening and 
neutral words before they are correctly re- 
ported. 

In the present study, a total of 1,133 pre- 
recognition guesses were recorded, 507 for 
neutral words and 626 for threat words. This 
difference, while statistically significant, dis- 
appears when the variable of number of trials 
for threat and neutral words is partialled 
out. 

Each prerecognition hypothesis was classi- 
fied independently by two judges³ in terms of 
five content categories, as follows: 

1. Structural: Responses containing three or four 
letters of the stimulus word in their proper location 
(e.g., place or plane for plant). 

2. Meaningless: Responses without a dictionary 
meaning (e.g., bryny for dirty or caleba for table). 

3. Neutral: Meaningful words, essentially non- 
threatening in nature (e.g., plate for death or wagon for 
magic). 

4. Emotional: Meaningful aggressive, escape, or 
sexual responses (e.g., breast for cheat or phobia for 
penis). 

5. Marginal: A wastebasket category for border- 
line hypotheses, not clearly classifiable either as neutral 
or emotional (e.g., screw for cheat or fairy for penis).⁴ 

3 The authors wish to express their appreciation to 
Judith Hess of the University of Rochester for her con- 
tributions to the analysis of the prerecognition hy- 
potheses. 
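
Of the five categories, only the structural one rests on a purely formal rule, so it can be expressed mechanically. The sketch below is a hypothetical illustration, not the judges' actual procedure: it counts letters that agree with the stimulus word position by position and calls a guess structural when three or four positions agree. Comparing only up to the length of the shorter word is an assumption made here for guesses that are not five letters long.

```python
def matching_positions(guess, stimulus):
    """Number of letters that agree with the stimulus in the same position."""
    return sum(g == s for g, s in zip(guess.lower(), stimulus.lower()))


def is_structural(guess, stimulus):
    """True when three or four letters sit in their proper location."""
    return matching_positions(guess, stimulus) in (3, 4)


# Examples taken from the category definitions in the text.
examples = [
    ("place", "plant"),
    ("plane", "plant"),
    ("plate", "death"),
    ("wagon", "magic"),
    ("breast", "cheat"),
]
for guess, stimulus in examples:
    print(guess, stimulus, matching_positions(guess, stimulus), is_structural(guess, stimulus))
```

The other four categories depend on judged meaning and are not reducible to a rule of this kind.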


Agreement on classification between the 
two judges exceeded 94 per cent. Most of the 
disagreements occurred in assignments to the 
marginal category. Following the original 
independent ratings, in order to facilitate sub- 
sequent statistical analysis, the judges dis- 
cussed items for which agreement had been 
lacking, arriving thereby at a final assignment 
of each item to one of the five content cate- 
gories. Of the total of 1,133 responses, 28.3 
per cent were classified structural, 2.4 per cent 
meaningless, 49.6 per cent neutral, 11.7 per 
cent emotional, and 8.0 per cent marginal. An 
over-all chi-square test, comparing the fre- 
quencies of prerecognition guesses in the five 
content categories for threat and neutral 
stimulus words, indicates that the observed 
frequencies are close enough to the expected 
frequencies to be accounted for solely on the 
basis of chance variation. This analysis was 
extended so that each of the five individual 
content categories could be examined against 
all others. Here, significantly more (p <.05) of 
the prerecognition hypotheses classified as 
structural were given to neutral words than 
to threatening ones. In no other instance was 
a significant difference observed. 

Of the five classification categories used, 
four were based on content, while the fifth, 
the structural, was determined entirely by 
formal properties of the prerecognition hy- 
potheses. For this reason it was possible to 
reclassify structural responses into one of the 
four other categories according to the content 
of the responses. The results of this secondary 
analysis of prerecognition hypotheses, together 
with the chi-square value obtained from tabu- 
lation of observed and theoretical frequencies, 
are presented in Table 1. Further, each of the 
three classification categories was tested 
against the other two combined; it was found 
that significantly more neutral hypotheses 
were given for the neutral stimulus words, 
while significantly more emotional and mar- 
ginal guesses occurred in the case of threat 
words. (In each instance p < .01.) 

4 A later impression of the authors was that most of 
the words classified as marginal probably approximate 
emotional guesses more closely than neutral ones. 


TABLE 1 

CHI-SQUARE TEST BASED ON RECLASSIFICATION OF 
STRUCTURAL RESPONSES* 

CONTENT CATEGORY            NEUTRAL     THREAT      TOTAL 

Neutral                     127         85          212 
                            (106)**     (106) 
Emotional and marginal†     18          58          76 
                            (38)        (38) 
Meaningless                 15          17          32 
                            (16)        (16) 

Total                       160         160         320 

* χ² = 29.51, df = 2, p < .01. 

** Numbers in parentheses indicate expected frequencies. 

† These two categories have been pooled, in accordance with 
the notation in footnote 4 in the text, because the expected fre- 
quencies for the category marginal were extremely small. 
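
The chi-square analysis of the reclassified structural responses can be retraced from the observed frequencies in Table 1. The sketch below is a minimal illustration: the function names are hypothetical, expected frequencies are derived from the marginal totals, and no continuity correction is applied, which is an assumption about how the original values were computed. The comparison values 6.63 and 3.84 are the conventional .01 and .05 critical values of chi-square for one degree of freedom.

```python
def chi_square(table):
    """Pearson chi-square and df for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df


def one_vs_rest(table, index):
    """Test one category against all remaining categories combined."""
    others = [row for k, row in enumerate(table) if k != index]
    pooled = [sum(col) for col in zip(*others)]
    return chi_square([table[index], pooled])


# Observed frequencies from Table 1: rows are neutral, emotional and
# marginal (pooled), and meaningless; columns are neutral and threat words.
table_1 = [[127, 85], [18, 58], [15, 17]]

overall, df = chi_square(table_1)
print("over-all chi-square:", round(overall, 2), "df:", df)
for index, label in enumerate(["neutral", "emotional and marginal", "meaningless"]):
    statistic, _ = one_vs_rest(table_1, index)
    print(label, round(statistic, 2), "exceeds 6.63:", statistic > 6.63)
```

Computed this way, the over-all value comes out at about 29.5, in close agreement with the tabled 29.51, and the neutral and the pooled emotional and marginal categories each exceed the .01 criterion when tested against the remainder, while the meaningless category does not.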


CONCLUSIONS AND IMPLICATIONS 


The significantly greater number of trials 
required for accurate report of threatening 
words as compared to neutral ones constitutes 
a verification of our earlier finding that per- 
ceptual response appears to be less effectual 
under conditions of threat-expectancy as well 
as direct threat. That these data may not be 
explained on the basis of a word-frequency 
hypothesis is indicated by an absence of cor- 
relation between word frequency and number 
of trials required for accurate report under 
the threat-expectancy condition. Moreover, an 
explanation of the present data in terms of 
conscious inhibition does not appear to be 
entirely satisfactory. For one thing, in the 
threat-expectancy situation, by alerting Ss 
to the impending occurrence of threatening 
stimuli, the credibility of their subsequent 
appearance is increased and their verbalization 
during the test series presumably facilitated. 
The absence of admitted instances of deliberate 
inhibition, as determined by posttest inquiry, 
supports this point of view. Further, we have 
been able to demonstrate an absence of dif- 
ferential inhibition as a function either of the 
sex of the S or of the sex similarity or dissimilarity 
of the S-E constellation. 

5 Here the argument could be advanced that the 
significantly greater number of emotional responses 
resulting from our reclassification of structural guesses 
is simply a function of the fact that S had actually per- 
ceived the threatening word, except for a single letter 
(e.g., rapes or raper for raped). Investigating this pos- 
sibility, the authors found a total of 10 such instances. 
In all other cases, while the criterion of three or four 
letters in their proper sequence was met, the emotional 
prerecognition hypotheses differed substantially in con- 
tent from the threatening stimulus word (e.g., peril 
for penis, naked for raped). Even with the 10 suspect 
cases discarded from our analysis, the significance of 
the results presented in Table 1 holds up. 

We interpret the findings of the present 
research as indicative of the probable operation 
of a perceptual defense process. To the extent 
that one is willing to accept as an index of 
defense the significantly higher incidence of 
emotional prerecognition hypotheses for threat 
words than for neutral ones, which was ob- 
served in the reclassification of structural 
hypotheses, this position is strengthened. 

Alerting Ss to impending threat in the threat- 
expectancy situation creates an opportunity 
for innumerable perceptual and behavioral 
adaptations. As a next step in this area of 
research it seems feasible to study system- 
atically correlates of the various adaptations 
made by Ss in terms of other personality vari- 
ables. Understanding the behavioral correlates 
of adaptation to threat is important to both 
the theoretician and the clinician. The fruitful- 
ness of studying individual differences in 
reactivity to perceptual threat has already been 
demonstrated (4). The potential value of 
placing such information at the service of the 
clinician has been suggested (8). 


SUMMARY 


In order to investigate whether delayed 
perceptual report of threatening words would 
occur under conditions of threat-expectancy 
in an experimental setup which makes possible 
examination of the operation of possible 
conscious inhibition as well as differential 
word frequencies, 59 Ss were given a series 
of 16 booklets, each containing 30 carbon 
copies of one five-letter stimulus word, to 
decipher. Eight of these words were considered 
neutral and eight threatening. 

Significantly more trials were required for 
correct report of threat words as compared 
to neutral ones. No correlation between num- 
ber of trials required for correct identification 
of the test words and word frequency was 
found. Several logical and statistical checks 
were used to reduce the likelihood of explaining 
the results on the basis of conscious inhibition. 
The findings are interpreted as consistent with 
a concept of perceptual defense. This interpreta- 
tion was supported by an analysis of the pre- 
recognition hypotheses. 

Some possible implications for personality 
theory and clinical practice, as well as sugges- 
tions for further research, have been proposed. 


PREDICTING LANGUAGE BEHAVIOR FROM OBJECT SORTING¹ 


LAURENCE S. McGAUGHRAN 


DESPITE its historical and theoretical 
significance, the area of conceptual 
behavior is almost virgin territory in 
psychological research. The work which has 
been done in this area has been mainly con- 
cerned with abnormalities in concept formation. 
These studies have generally dichotomized all 
conceptual behavior into “concrete” and “ab- 
stract” levels or “attitudes.” The terms 
concrete and abstract have generally been used 
by Goldstein and others to differentiate be- 
tween the conceptual behavior of normal 
adults and that of such deviant groups as the 
brain-damaged, schizophrenics, and children. 
According to this analysis, an abstract attitude 
may be achieved by normal adults; individuals 
outside this class are characteristically concrete 
in their conceptualizations. 
Goldstein defines concrete and abstract at- 
titudes as follows: 


We can distinguish normally two different kinds of 
attitudes we call the concrete and the abstract. In the 
concrete attitude we are given over passively and bound 
to the immediate experience of unique objects or situ- 
ations. Our thinking and acting are determined by the 
immediate claims made by the particular aspect of the 
object or situation.... [in the abstract attitude] we 
transcend the immediately given specific aspect of 
sense impressions, we detach ourselves from the latter 
and consider the situation from a conceptual point of 
view and react accordingly. Our actions are determined 
not so much by the objects before us as by the way 
we think about them; the individual thing becomes a 
mere accidental example or representative of a “cate- 
gory” (3, p. 6). 


Goldstein (2) clearly states his assumption 
that these attitudes are dichotomous. Others 
(e.g., 1, 4, 9) have attempted to spread out 
the concrete-abstract dichotomy into a con- 
tinuum. However, these attempts have also 
involved research with deviant groups, and 
appear to have been directed toward breaking 
down one member of the polarity or the other, 
but not both in the same terms. In other words, 
there has not been developed a unitary set of 
concepts which may be used to dimensionalize 
an entire range of conceptual behavior in 
general psychological terms. 

1 This report is adapted from a part of a dissertation 
submitted to the faculty of the Graduate School of 
Ohio State University in partial fulfillment of the re- 
quirements for the degree of Doctor of Philosophy. I 
am indebted to Dr. George A. Kelly, who directed the 
dissertation, and to Drs. Donald Ramsdell, Julian B. 
Rotter, D. D. Wickens, Ross Mooney, and John W. 
Black, doctoral committee members, for their exten- 
sive encouragement and support during this study. 

The apparent lack of exact meaning for the 
concept, “concrete,” makes it especially dif- 
ficult to dimensionalize conceptual levels or 
attitudes. As originally used by Goldstein 
(see 3) to characterize conceptual behavior in 
brain-damaged patients, concrete attitude 
seemed to mean an inability (or, at least, a 
special difficulty) in breaking down perceptual 
wholes into parts. That is, an object was not 
seen by the patient as a “carrier” of a number 
of attributes, but as a nonanalyzable whole. 
Thus, when the patient grouped objects into 
categories, he did so in such a way that the 
whole-character of the objects was maintained 
(e.g., grouping on the basis of identity instead 
of similarity). Elaboration of this use of the 
term, concrete, in subsequent investigations 
has introduced a number of other criteria for 
concrete behavior which are more or less con- 
sonant with this original observation; these 
have included an inability to “shift” conceptu- 
ally from one attribute to another with the 
same group of objects, a lack of “spontaneity” 
in grouping, and “narrow” conceptual limits. 

However, as subsequent studies introduced 
additional experimental groups, such as chil- 
dren, aments, neurotics, and schizophrenics, 
additional criteria for concretism were sug- 
gested which do not seem to correspond with 
the original meaning of the term. Examples 
of these are the conceptual grouping of objects 
on the basis of past personal experience, nar- 
rative sequences, and private symbolic mean- 
ings. 

An analysis of all of the criteria for con- 
cretism cited in the literature suggests that 
there are at least two broad categories of 
conceptual behavior to which concrete concepts 
may be assigned. The first more or less coin- 
cides with the meaning which is inferred to 
have been originally intended by Goldstein by 
the term, concrete, as discussed above. The 
second category of conceptual behavior to 
which a number of concrete concepts can be 
assigned might be designated as that involving 
“private,” or “nonpublic,” principles of collec- 
tion. Thus, when the individual groups objects 
on the basis of differences in his feelings toward 
them or the symbolic significance they hold for 
him, the observer without foreknowledge of 
these private events is unable to predict the 
limits of the conceptual grouping. Such group- 
ings do not, however, necessarily reflect an 
inability to break down wholes into parts. 

The experimental variable employed in the 
present research is an elaboration of this dis- 
tinction made between two categories of 
concrete conceptualization. 


DEFINITIONS 


To set up a system of analysis of conceptu- 
alization which would “cut through” the 
previously employed abstract-concrete dichot- 
omy, it seemed necessary to make a number of 
distinctions which have not heretofore been 
expressed. The following definitions are set 
forth to convey these distinctions; the terms 
employed are, of course, constructs limited in 
application at the present time to the opera- 
tions of the present experiment. 

1. A concept is a term identifying the prin- 
ciple employed by an individual to collect two 
or more objects (or events) to which he in 
some manner intends to respond similarly. 
The nature of a concept necessarily implies 
that some other abstractible qualities of the 
single objects are disregarded by the individual 
when the objects are so collected. 

The total of conceptual-group-memberships 
is the sum total of all possible abstractions 
which could be made from objects collected 
within a specific conceptual grouping. 

2. Conceptual freedom is a postulated dimen- 
sion of conceptualization which describes the 
extent to which an individual concept permits 
the potential inclusion of additional conceptual- 
group-memberships within its limits. The degree 
of conceptual freedom is inversely related to 
the complexity (i.e., multiplicity of attributes) 
of the collecting principle for the specific 
concept. (For example, a concept of “white, 
square objects” would by definition exclude 
“white, not-square” and “square, not-white” 
objects, and would, therefore, have a lesser 
degree of freedom than either “white objects” 
or “square objects” as separate concepts.) 
The terminals of this dimension are designated 
“closed” and “open-ended,” respectively. 
Closed concepts tend toward restrictiveness, 
and open-ended concepts tend toward infinite 
freedom, in the variety of objects (and thus ad- 
ditional conceptual-group-memberships) that 
would be included within their limits. 

3. Conceptual extensionality is a postulated 
dimension of conceptualization which describes 
the extent to which an individual concept 
permits potential public prediction of con- 
ceptual group limits. The degree of conceptual 
extensionality expresses the extent to which 
the principle underlying the individual concept 
is shared and freely communicated by the 
majority of persons within the same culture. 
(For example, the principle for “triangularity” 
is culturally denotable; the principle for 
“attractiveness” is considerably less so.) The 
terminals of this dimension are designated 
“public” and “private,” respectively. 

4. Conceptual area is a broad class of concepts 
grouped together by empirical definition. 
Definition of the limits of a conceptual area 
is an arbitrary procedure based upon the 
setting of a guessed cutting point for two or 
more dimensions of conceptualization to form 
an assumed axis, for the purpose of testing a 
hypothesis concerning the “best fit” for the 
axis. The term conceptual area is used to 
contrast the present formulation with previous 
ones using conceptual level as a basis for 
classification. Classification by conceptual 
level represents an attempt to locate concepts 
at points along a single continuum; classifica- 
tion by conceptual area implies the use of 
two (or more) continua as coordinates to locate 
concepts within a space common to both (or 
to all) of them. Three conceptual areas are here 
hypothesized: hypostatic, autistatic, and 
metastatic (for definitions, see below). The 
first two of these areas are those employed in 
the present experimental groups. 

5. The hypostatic² conceptual area (containing 
“H concepts”) lies within the quadrant be- 
tween the closed and public terminals of the 
dimensions of conceptual freedom and con- 
ceptual extensionality, respectively (cf. Fig. 1). 
This area consists of those concepts which 
have a relatively low potential for additional 
conceptual-group-memberships and a relatively 
high potential for public prediction of con- 
ceptual group limits. Conceptual groupings 
within this area would be closed and public; 
that is, the complex principle used to group 
the objects would limit the kinds of objects 
that could be included in a group but would 
be readily understood by others. 

2 The dictionary definition for hypostasis is a “sub- 
stance, subsistent principle, or essential nature of any- 
thing; a subject in which attributes are conceived to 
inhere.” 

H concepts are complex concepts. A com- 
plex concept is a term whose principle of collec- 
tion “freezes” two or more abstractible items 
within the total of its conceptual-group-mem- 
berships into a fixed relationship. The principle 
of collection for an H concept is based upon 
the identity, species-identity, or quasi-identity, 
of the objects within its group. Identity is 
assumed exact replication of objects within a 
conceptual group. A species is a group of ob- 
jects, each of which is “called,” or usually 
identified, by the same name (for example, 
knives, pencils). Quasi-identity is the partial 
identity of objects within a group induced 
through the “freezing” of two or more items 
into a fixed relationship upon which the 
concept is based (for example, “red, round, 
rubber objects”). 

6. The autistatic³ conceptual area (containing 
“A concepts”) lies within the two quadrants 
formed by the entire conceptual freedom 
dimension and the private terminal of the 
conceptual extensionality dimension (cf. Fig. 
1). This area consists of those concepts which 
have a relatively low potential for public pre- 
diction and an undetermined potential for 
additional conceptual-group-memberships. (In- 
formal observation in this particular experi- 
mental situation suggests that the quadrant 
of low potential for additional conceptual- 
group-memberships and low potential for 
public prediction would be a relatively small 
area, at least for the population from which 
the present sample was drawn. It might be 
hypothesized that such responses would fre- 
quently be psychotic, or at least markedly 
nonadjustive, in nature. This is, of course, 
nothing more than a speculation at this point, 
since this present experiment was not designed 
to test this hypothesis.) 

3 The term, “autistatic,” was coined by the writer 
to convey the meaning of a form of conceptualization 
based upon a private or noncommunicated principle 
of collection. 


FIG. 1. REPRESENTATION OF CONCEPTUAL AREAS 
FORMED BY AN ASSUMED AXIS OF THE DIMENSIONS 
OF “CONCEPTUAL FREEDOM” AND “CONCEPTUAL 
EXTENSIONALITY” 

7. The metastatic⁴ conceptual area (containing 
“M concepts”) lies within the quadrant be- 
tween the open-ended and public terminals 
of the conceptual freedom and conceptual 
extensionality dimensions, respectively (cf. 
Fig. 1). This area consists of those concepts 
which have a relatively high potential both 
for additional conceptual-group-memberships 
and for public prediction.⁵ 
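
The assignment of a concept to one of the three conceptual areas can be sketched as a lookup on the two dimensions. The sketch is a hypothetical illustration only: the 0-to-1 scales, the cutting point of 0.5, and all names are assumptions introduced here, since the definitions above describe the cutting point as an arbitrary, guessed value and give no numerical scale.

```python
from enum import Enum


class Area(Enum):
    HYPOSTATIC = "H"  # closed and public
    AUTISTATIC = "A"  # private, whether closed or open-ended
    METASTATIC = "M"  # open-ended and public


def conceptual_area(freedom, extensionality, cut=0.5):
    """Locate a concept by its coordinates on the two dimensions.

    freedom:        0.0 (closed) to 1.0 (open-ended), hypothetical scale
    extensionality: 0.0 (private) to 1.0 (public), hypothetical scale
    cut:            assumed cutting point shared by both dimensions
    """
    if extensionality < cut:
        # The private side covers both quadrants of the freedom dimension.
        return Area.AUTISTATIC
    return Area.METASTATIC if freedom >= cut else Area.HYPOSTATIC


# Hypothetical coordinates for three illustrative concepts.
print(conceptual_area(freedom=0.2, extensionality=0.9).value)
print(conceptual_area(freedom=0.8, extensionality=0.9).value)
print(conceptual_area(freedom=0.8, extensionality=0.1).value)
```

The function makes explicit that classification by conceptual area uses two coordinates, whereas classification by conceptual level would use only one.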


STATEMENT OF OPERATIONS AND HYPOTHESIS 


A method frequently used in studies of con- 
ceptualization (e.g., 1, 3, 9, 10, 11) involves 
the task of sorting commonly known, “every- 
day” objects, such as toys, tools, smoking, 
writing, and eating equipment, etc., into groups 
into which they “belong.” This method, with 
some alterations in administration and scoring 
procedures as discussed below, was adopted 
in the present study to provide a measure of 
the criterion of hypostatic vs. autistatic con- 
ceptualization. 

4 The dictionary definition for metastasis is a “change 
of state, substance, or form.” 

5 The concept, metastatic conceptual area, is em- 
ployed in the present experiment only to exclude certain 
individuals within the sample from the experimental 
groups. With some exceptions, the concepts falling 
within this area correspond to those designated as on 
an “abstract level” by previous investigators. 

The field of individual differences in language 
behavior appears to be a rich source of data 
for psychological research, but the results of 
investigations in this area have frequently 
seemed cloudy. One possible reason for this 
may be that the experimental criteria used 
to predict language behavior have most fre- 
quently lacked clear definition (e.g., psy- 
chiatric categories). Since the present study 
attempts a clearer delineation of behaviors 
collected under the term concretism, and since 
language behavior, itself, involves conceptu- 
alization, it seemed particularly appropriate to 
use language analysis as a method to test the 
validity of the separation of concrete con- 
ceptualization in an object-sorting task into 
hypostatic and autistatic areas. 

The hypothesis is formally stated as follows: 
A significant relationship exists between con- 
ceptual performance in an object-sorting situa- 
tion, as expressed by dominant “conceptual 
area,” and language performance, as measured 
by an analysis of language usage in a picture- 
interpretation situation. 


METHODS 


Preliminary Studies 


Four preliminary studies using a total of 35 sub- 
jects (Ss) were conducted to develop administration 
and scoring manuals (7) for the object-sorting situation 
(OSS) as a criterion task and the language-behavior 
situation as a predictor task. The first three of these 
studies were largely empirical explorations of methods 
and measures; the fourth preliminary study was more 
formally designed to choose an adequate measure in the 
sorting task and to eliminate all language measures ex- 
cept those which permitted the most objective scoring 
and which seemed to fit within a coherent rationale. 
To simplify presentation, the administration and 
scoring procedures for both tasks are described at this 
point. In actual fact, some alterations to provide this 
final form were introduced for both procedures during 
the item analysis phase of the fixed-design experiment, 
to be discussed below. 

Development of procedures for scoring the OSS. The 
32 objects used in the sorting task corresponded roughly 
to those used by previous investigators, although no 
special attempt was made to match them exactly be- 
cause no substantial body of normative data is avail- 
able to warrant rigidly fixed procedures. The modifica- 
tions in procedure introduced in the present study were 
in methods of administration and scoring. As regards 
administration of the OSS, instead of presenting all 
of the objects to S simultaneously, E handed over each 
object separately and in a fixed sequence designed to 
force a choice between alternative groupings for each 
object presentation. When these sortings were com- 
pleted, the objects were shuffled into one pile and S 
was requested to try to resort them in a different way. 
Responses for both phases (sorting and resorting) were 
used in computing the criterion score for each S. 

In scoring the OSS, E categorized each of S’s sort- 
ings in this task as occurring within the hypostatic, 
autistatic, or metastatic areas of conceptualization. 
A list of the subareas of classification employed in mak- 
ing these designations is shown in Table 1, together 
with an example for each class of conceptual grouping. 

The criterion measure was termed the hypostatic- 
autistatic index (H-A index). It was obtained for each 
S by subtracting the percentage of his autistatic con- 
cepts from the percentage of hypostatic concepts when 
either figure was higher than his percentage of meta- 
static concepts. Those Ss who did not meet this cri- 
terion (that is, Ss whose percentage of metastatic con- 
cepts exceeded the percentage of concepts in each of the 
other two areas) were excluded from the present study. 
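
The computation of the H-A index described above can be sketched as follows. This is a minimal illustration, not the author's scoring procedure: the function name `ha_index` is hypothetical, the percentages are assumed to be taken over all scored concepts from both the sorting and resorting phases, and an S whose metastatic percentage merely ties one of the other two is assumed to be retained, since the text does not say how ties were handled.

```python
from typing import Optional


def ha_index(n_hypostatic: int, n_autistatic: int, n_metastatic: int) -> Optional[float]:
    """Return the H-A index in percentage points, or None if S is excluded.

    Arguments are counts of S's sortings scored in each conceptual area.
    """
    total = n_hypostatic + n_autistatic + n_metastatic
    if total == 0:
        raise ValueError("at least one scored concept is required")

    pct_h = 100.0 * n_hypostatic / total
    pct_a = 100.0 * n_autistatic / total
    pct_m = 100.0 * n_metastatic / total

    # Exclusion rule from the text: metastatic percentage exceeds the
    # percentage in EACH of the other two areas.
    if pct_m > pct_h and pct_m > pct_a:
        return None

    # Positive values indicate predominantly hypostatic sorting,
    # negative values predominantly autistatic sorting.
    return pct_h - pct_a


print(ha_index(12, 4, 4))   # 40.0
print(ha_index(3, 5, 12))   # None
```

Under these assumptions, a positive index places S toward the hypostatic group and a negative index toward the autistatic group.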

Development of procedures for the analysis of language 
behavior. Data from the language-behavior situation 
consisted of narrative interpretations to a series of 12 
pictures presented in sequence to each S.* Instructions 
to S for this task corresponded, in general, to those 
usually employed in administering the TAT. These 
interpretations were electrically recorded and tran- 
scribed to typewritten protocols; each protocol was 
proofread at least twice before the language analysis. 

To explain the rationale used to develop and to group 
the language measures, it seems necessary to discuss 
briefly the general behavior tendencies which were as- 
sumed, a priori, to be associated with regnant hypo- 
static and autistatic conceptualization. When the indi- 
vidual tends to require complete identity as the basis 
for his sorting of objects in the OSS, he maintains a 
high degree of order and precision in his conceptual 
organization. His conceptual distinctions are based 
upon a fine discrimination of perceptual dissimilarity 
between one complete object and another (that is, 
between one complex of conceptual-group-member- 
ships and another); this efficiency in conceptual or- 
ganization is maintained so long as the parameters 
of conceptual-group-memberships remain relatively 
“fixed,” but it tends to break down when changes or 
ambiguities are introduced into the stimulus field. On 
the other hand, when S consistently uses the objects 
in the OSS simply to “trigger” a series of private as- 
sociations, his conceptual organization may have little 
or nothing to do with the perceptual similarity of ob-
jects within his groups. The introduction of changes or
ambiguities, as perceived by E, in the stimulus field
may be completely disregarded by S unless these 
changes coincide with his own autistic shifts in asso- 
ciation. 

An individual’s conceptual system is his organization 
of his field of awareness. If he consistently employs a 
particular form of conceptual organization under one 





* Nine of these pictures were taken from the TAT 
series (cards 1, 2, 4, 10, 11, 13, 14, 16, and 20). The 
other three were taken from a set prepared by Julian 
B. Rotter, Ohio State University. Card 1 was used as 
a “warm-up” task only; narratives for this card were 
not used in the language analysis. 





TABLE 1

AREAS AND SUBAREAS OF CONCEPTUALIZATION SCORED IN THE OBJECT-SORTING SITUATION
WITH EXAMPLES OF GROUPINGS AND DEFINITIONS FOR EACH SUBAREA

AREA AND SUBAREA OF CONCEPTUAL GROUPING | EXAMPLE OF OBJECTS GROUPED | EXAMPLE OF STATED BASIS FOR GROUPINGS

HYPOSTATIC AREA
Concepts of identity | |
  a. Absolute identity | 2 corks | “corks—the same thing”
  b. Species identity | toy fork, fork | “forks go together”
  c. Identity of function | toy hatchet, toy hammer | “hatchet and hammer—serve same purpose”
Concepts of induced partial identity | |
  a. Direct multiple restriction | 2 sugar cubes, file card | “white and rectangular”
  b. Indirect multiple restriction | toy tools, tools | “children’s tools here—real tools there”
  c. Juxtaposition of similarity and dissimilarity | green square, red circle, file card | “same material and different colors”
  d. Common internal variation | pipe, spoon | “stems with depression at base”
Concepts of situational co-occurrence of species | |
  a. Reciprocal cofunctionality | pipe, matches | “light pipe with matches”
  b. Specific mediation | file card, blotter | “write on with ink, need blotters”
  c. Unstructured situational co-occurrence | pencil, eraser | “seem to go together”

AUTISTATIC AREA
Physiognomic concepts | |
  a. Judgmental | (several) | “objects that please me”
  b. Subjective elaboration | lock, toys | “keep people from stealing”
  c. Experiential contiguity | blotter, cigarettes, matches | “have on my desk at home”
Concepts of construction and design | green square, red circle, ball (spatially arranged) | “fellow with ear muffs”
Chain and radial concepts | block, sink stopper, eraser, red circle (3 groups) | “these have projections (bl, ss); the stopper also goes with eraser because rubber; and circle with stopper—round”
Representational concepts | matches, cork | “burnt cork for minstrel”
Narrative and conditional concepts | 2 corks | “close whiskey bottle with one, have another if get drunk and lose it”
Concepts of implicit and loose mediation | |
  a. Implicit mediation | tableware, sugar, smoking equipment | “to eat and smoke”
  b. Loose (synecdochal) mediation | matches, toy hatchet | “matches from trees and ax chops trees down”
  c. Heterogeneous concepts | (several) | “all are made of either wood or metal or both”
  d. Nonce concepts | (several) | “rest over there—don’t know why”

METASTATIC AREA
Concepts of unitary abstraction | red square, ball, and eraser | “they’re red”
Concepts of superordination—nominal | several toys | “all these are toys”
Concepts of superordination—implicit | several toys | “all of these you use for play”

set of stimulus conditions, it might be hypothesized
that he would show similar tendencies under other 
conditions. This is really nothing more than a specific 
assumption of the existence of relatively general be- 
havior potentials (or traits) within a particular area 
of personality organization.⁷ On an a priori basis, an
assumption was made that the individual’s consistent
use of hypostasis as a principle of conceptual organiza-
tion might be associated with a more general behavioral
potential toward “reality-fixing,” or “stimulus-bound-
ness.” Similarly, it was assumed that the consistent
grouping of objects in the OSS on the basis of private,
nonextensional meanings to the individual might be
associated with a more general behavior potential
toward “autism.”

If these consistencies in behavior in the sorting task
actually are associated with more general behavior po-
tentials of reality-fixing vs. autism, it might be ex-
pected, again on an a priori basis, that individuals under
different stimulus conditions would show other con-
sistent differences in behavior in more specific terms
such as rigidity-fluidity, inhibition-spontaneity, pas-
sivity-autonomy in reality structuration, vigilance-
laxness in reality-testing, etc. Such concepts as these,
lacking precise definition, were used rather loosely as
guides in the selection and grouping of the measures of
language behavior elicited in the picture-interpretation
task. It might be added that this projective task seemed
admirably suited for the present experimental problem
because its purpose is to require an attempt to interpret 
ambiguous stimulus material. 

The 39 frequency measures finally selected for lan- 
guage analysis are shown in Table 2. A brief descrip- 
tion and an illustration are provided for each measure. 
The 39 separate measures were grouped into eight 
classes of common rationale in order to test both specific 
and more general hypotheses of relationship between 
behavior potentials. The specific and general hypotheses
for the measures in each class are set forth below in
terms of the predicted performance for the hypostatic
group (H group).

1. Class A (language flow). The general hypothesis
was that the H group would show less spontaneity,
more inhibition in picture interpretation. The specific
hypotheses for each measure were: (a) Measure 1—
low total verbal production for H group. (b) Measure
2—low amount of speech interference (i.e., “concept
rivalry”) for H group. (c) Measure 3—high fluency in
produced speech for H group.

2. Class B (maintenance of autonomy). The general 
hypothesis was that the H group would show less 
autonomy in imposing new structure upon the experi- 
mental situation. The specific hypotheses for the sepa- 
rate measures were: (a) Measures 4 and 6—fewer 
“free” questions and fewer remarks out of context of 







7 A more recent test of this assumption was made by
Jeffreys (5). He found a low but significant relationship
between hypostatic vs. autistatic conceptualization
in the object-sorting situation, and range of experienced
movement in the phi phenomenon and “form-bound
vs. form-labile” performance on the Rorschach test.
The latter two situations were introduced by Klein
and Schlesinger (6) to demonstrate the generality of
“perceptual attitudes” which they term “resistance to
instability” and “tolerance for instability.”
the experimental task by the H group. (b) Measures
5 and 7—more requests for reassurance, and more re- 
marks protesting difficulty in interpreting ambiguous 
details in the stimulus cards by the H group. 

3. Class C (flexibility of conceptual organization). 
The general hypothesis was that Ss in the H group, 
tending toward description rather than interpretation 
of the narrative cards, would show a greater initial 
precision in their choice of words; that is, their verbal 
responses would have a more deliberative, either-or 
character. The specific hypotheses for the separate 
measures were: (a) Measures 8, 9, 10, and 11—fewer 
slips of speech, and fewer corrections and elaborations 
of verbal constructions once produced by the H group. 
(b) Measure 12—more attempts by the H group to
emphasize “psychological distance” between the nar- 
rator and the stimulus card by substituting a modal 
auxiliary (e.g., could be) for a copulative (is) form of 
predication. 

4. Class D (maintenance of narrator role). The general 
hypothesis was that the H group, showing less spon- 
taneity, free imagination, and autonomy in projective 
interpretations, would attempt to maintain psycho- 
logical distance by literalizing the experimental task 
of “making up a story.” The specific hypotheses for 
the separate measures were: Measures 13, 14, and 15— 
more references by the H group to the artificiality of 
the stimulus material and to the fictional nature of 
their narrative responses. 

5. Class E (reality-modifying expressions). The 
general hypothesis was that when the H Group did 
qualify or extend their interpretations, it would be in 
the direction of either greater precision of description 
or less positiveness of an assertion. The specific hy- 
potheses for the separate measures were: (a) Measures 
16, 17, and 18—fewer qualifying terms by the H group 
to convey allness, completeness, certainty, or limitless 
progression. (b) Measures 19 and 20—more qualifying
terms by the H group to convey exactitude or attenua- 
tion of the “reality” of their story interpretations. 

6. Class F (reality-structuring expressions). The 
general hypothesis was that the H group would show 
less autonomy in structuring the reality of their stories. 
Thus, their protocols should contain more construc- 
tions assessing the likelihood that their interpretations 
coincided with what the stimulus cards “really meant.” 
The specific hypotheses for the separate measures were: 
(a) Measures 21, 22, 23, and 24—more constructions 
by the H group assessing the possible or probable mean-
ings “resident” in the stimulus card. (b) Measure 25—
fewer constructions by the H group referring to S’s 
own experience as the source of meaning for the narra- 
tive content. 

7. Class G (resistance to the experimental task). Since 
the number of rejections of stimulus cards (Measure 
26) was the only measure in this class, the only hy- 
pothesis was that the H group would be less able to 
carry out the experimental task of narration with 
cards which presented marked ambiguity of perceptual 
detail. 

8. Class H (summation within-group measures). The
13 measures (Measures 27 through 39) within this class 
are simply raw score summations of the results of 
measures within each class listed above. They are added 
to provide a test for the validity of the common ra- 
tionale underlying each general hypothesis. 
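
The Class H summations can be written out directly from the definitions given for Measures 27 through 39 in Table 2. The sketch below is illustrative only: the function name `class_h_summations` and the dictionary layout are hypothetical, and the last term of Measure 29 is garbled in the source table, so it is assumed here to be Measure 7 (i.e., Measures 4 plus 6, minus Measures 5 plus 7).

```python
from typing import Dict


def class_h_summations(m: Dict[int, float]) -> Dict[int, float]:
    """Compute Measures 27-39 from one S's raw scores on Measures 4-25.

    `m` maps a measure number to that S's raw score.
    """
    h: Dict[int, float] = {}
    h[27] = m[4] - m[5]                      # autonomous balance, questions
    h[28] = m[6] - m[7]                      # autonomous balance, asides
    h[29] = (m[4] + m[6]) - (m[5] + m[7])    # autonomous balance, total (Measure 7 assumed)
    h[30] = m[8] + m[9] + m[10] + m[11]      # conceptual fluidity
    h[31] = h[30] - m[12]                    # conceptual fluidity-fixity balance
    h[32] = m[13] + m[14] + m[15]            # total narrator distance
    h[33] = m[13] + m[14]                    # narrator distance (direct reference)
    h[34] = m[13] + m[15]                    # narrator distance (stimulus material)
    h[35] = m[16] + m[17] + m[18]            # reality-expanding expressions
    h[36] = m[19] + m[20]                    # reality-narrowing expressions
    h[37] = h[36] - h[35]                    # narrowing-expanding balance
    h[38] = m[21] + m[22] + m[23] + m[24]    # hypostatic reality structuration
    h[39] = h[38] - m[25]                    # hypostatic-autistatic balance
    return h


# Made-up raw scores: measure k simply has raw score k.
example = class_h_summations({k: k for k in range(1, 27)})
print(example[29], example[31], example[37], example[39])  # -2 26 -12 65
```
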
TABLE 2

CLASS, SPECIFIC TITLE, DESCRIPTION, AND ILLUSTRATION OF EACH FREQUENCY MEASURE
USED IN THE ANALYSIS OF LANGUAGE BEHAVIOR DURING A PICTURE-INTERPRETATION TASK

CLASS, NUMBER, AND SPECIFIC TITLE FOR EACH LANGUAGE MEASURE | BRIEF DESCRIPTION OF MEASURE | ILLUSTRATION OF LANGUAGE CONSTRUCTION

Language flow (Class A) | Measures of verbal production and fluency |
1. Total verbal production | Total number of words used in interpretation task | -
2. Interference noises | Nonmeaningful sounds | [illegible]
3. Word contractions | Contractions increasing fluidity of speech | “This man’s going to...”

Maintenance of autonomy (Class B) | Measures of verbal behavior outside of the context of the narration task |
4. Autonomous questions | All questions by S except requests for reassurance | “Which card was it that looked like this one?”
5. Nonautonomous questions | Requests for reassurance or for E’s evaluation at completion of story. | “Is that story alright?”
6. Autonomous asides | “Aside” remarks by S not immediately related to the experimental task of narration. | “Such a person always gets by.”
7. Nonautonomous asides | “Aside” remarks by S protesting some feature of the picture or of the narration task. | “The background is too shadowy, I don’t know what he’s doing.”

Flexibility of conceptual organization (Class C) | Measures of S’s correction or elaboration of words or phrases already uttered. |
8. Permeability | Correction of slips of speech, such as a jumbling of words or an “anticipation” of a word before its usual order in the sentence. | “The books she hand—has in her hand...”
9. Enhancement | A break in continuity in which a word or a phrase is substantially repeated, but with elaborations or amendments. | “...listening to the music—to the tones of the music.”
10. Mirroring | An abrupt shift from a positive to a negative form (or v.v.) with retention of essentially the same meaning. | “I don’t think—I think...”
11. Parenthetical constructions | Sequential phrases joined by such connectives as “rather,” “or,” etc. in which a refinement of the original statement seems to be intended. | “...aspect—rather, attitude...”
12. Reality reductions | Sequential phrases in which there is an apparent attempt to reduce the positiveness of an interpretative assertion. | “He is—seems to be angry.”

Maintenance of narrator role (Class D) | Measures of S’s presumed attempt to maintain “psychological distance” through continued references to the “props” of the narrating task. |
13. Direct references to stimulus material | Usage of the terms picture, painting, or card during the interpretation. | “This picture reminds me...”
14. Direct references to narrating terminology | Terms presumably emphasizing the fictional nature of the interpretative content (e.g., theme, tale, scene). | “It could be a scene in which...”
15. Pronominal references to stimulus material | Substitution of pronouns within the same or a succeeding sentence for terms scored in Measure 13. | “This picture has nothing on it. It does not look...”

Reality-modifying expressions (Class E) | Measures of the usage of terms serving to strengthen or weaken the positiveness or extensiveness of an assertion. |
16. Allness expressions | Terms conveying completeness or entirety of a narrated event (e.g., completely, totally, utterly). | “He is completely exhausted.”
17. Expressions of certainty | Terms conveying the certainty of a narrated event (e.g., positively, absolutely, definitely). | “She is definitely dead.”
18. Open-ended expressions | Phrases which serve to “hold open” a phrase or clause for unspecified additions (e.g., and so on, and everything, and all that). | “She’s trying to comfort him, and all that, and...”
19. Expressions of exactitude | Terms conveying exactitude or singularity in the sense of an emphasis upon specification (e.g., only place, strictly speaking, exact time). | “It’s the only place it could happen.”
20. Attenuating expressions | Phrases which tend to reduce or weaken the meaning or effect of the words they modify (e.g., more or less, a little bit, something like). | “He probably is more or less wishing that...”

Reality-structuring expressions (Class F) | Measures intended to estimate the degree to which S accepts or evades the responsibility for the “reality-structure” of his narrative. |
21. Externalization of narrative “reality” (verbal) | Modal auxiliaries conveying fixity of the plot or of the characterizations as “resident” in the stimulus card. | “This might be a mother-son situation.”
22. Externalization of narrative “reality” (adverbial) | Adverbial terms or phrases conveying fixity of the plot or of the characterizations as “resident” in the stimulus card. | “It possibly is a question of...”
23. Probability hypostasis (verbal) | Modal auxiliaries or other verbal forms conveying probable fixity of the plot or of the characterizations as “resident” in the stimulus card. | “This picture seems to indicate that...”
24. Probability hypostasis (adverbial) | Adverbial terms or phrases conveying probable fixity of the plot or of the characterizations as “resident” in the stimulus card. | “He probably is the kind of fellow who...”
25. Internalization of narrative “reality” | Terms or phrases referring to S’s own reported experience as the focus or medium for his interpretation (e.g., I see, I associate). | “I have an idea that he is going...”

Resistance to experimental task (Class G) | A measure of the failures to carry out the task of “making up a story.” |
26. Theme avoidance | A response to a stimulus card in which there is only description, without narration or depiction of characters (as behaving-experiencing organisms). | “This means nothing—there’s a bridge and some kinds of animals, but you can’t tell what’s happening.”

Raw score summations within groups of language measures (Class H) | Measures based upon raw score (frequency count) summations of the foregoing measures grouped together on the basis of a common rationale. |
27. Autonomous balance—questions | Measure 4 minus Measure 5. |
28. Autonomous balance—asides | Measure 6 minus Measure 7. |
29. Autonomous balance—total | Measures 4 plus 6, minus 5 plus 7. |
30. Conceptual fluidity | Summation of raw scores for Measures 8, 9, 10, and 11. |
31. Conceptual fluidity—fixity balance | Measure 30 minus Measure 12. |
32. Total narrator distance | Summation of raw scores for Measures 13, 14, and 15. |
33. Narrator distance (direct reference) | Summation of raw scores for Measures 13 and 14. |
34. Narrator distance (reference to stimulus material) | Summation of raw scores for Measures 13 and 15. |
35. Reality-expanding expressions | Summation of raw scores for Measures 16, 17, and 18. |
36. Reality-narrowing expressions | Summation of raw scores for Measures 19 and 20. |
37. Reality narrowing—expanding balance | Measure 36 minus Measure 35. |
38. Hypostatic reality structuration | Summation of raw scores for Measures 21, 22, 23, and 24. |
39. Hypostatic-autistatic balance in reality structuration | Measure 38 minus Measure 25. |



Fixed Design for the Formal Experiment 


A “fixed-design” experiment was conducted to pro- 
vide a formal derivation and test for the general and 
specific hypotheses listed above. It consisted of two 
phases: (a) an item analysis to derive an exact basis 
for the prediction of differences in language behavior 
between the experimental groups; and (b) a cross-
validation study to test the occurrence of these dif-
ferences as a function of the variable of hypostatic vs.
autistatic conceptualization. 

The sample drawn for the fixed-design study con- 
sisted of a group of 74 college students enrolled in ele- 
mentary psychology classes at Ohio State University; 
each was placed in the object-sorting and picture- 
interpretation situations within a single experimental 
session, which averaged about an hour in duration. 
The 51 Ss who met the experimental criterion (i.e., 
those whose conceptual groupings were predominantly 
hypostatic or autistatic) were distributed into the 


following subgroups: (a) item analysis study—hypo-
static group 12, autistatic group 11; (b) cross-validation
study—hypostatic group 14, autistatic group 14. 

These subgroups were matched for age, educational 
level, test intelligence, and sex; an analysis of variance 
indicated no significant group differences for any of 
these control factors. 

Since the same E scored both procedures, it was 
necessary to set up a number of controls to prevent 
contaminations in prediction. These preventive steps 
(e.g., coding of names, avoidance of halo effect) are 
discussed in (7). Except where a simple frequency count 
was more appropriate (e.g., Measure 26, theme avoid- 
ance), the frequency for each of the language measures 
was converted into a ratio of the number of occurrences 
per 1,000 words of protocol. To distinguish the fre- 
quency count and the frequency ratio from the language 
scores, to be discussed below, both of the frequency 
expressions are called “raw scores.” 
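
The conversion of a frequency count to a rate per 1,000 words of protocol is simple arithmetic, sketched below. The function name `per_thousand_words` is hypothetical and the figures in the example are made up for illustration.

```python
def per_thousand_words(count: int, total_words: int) -> float:
    """Convert a raw frequency count into occurrences per 1,000 words of protocol."""
    if total_words <= 0:
        raise ValueError("protocol must contain at least one word")
    return 1000.0 * count / total_words


# A hypothetical S who produces 9 attenuating expressions in a 3,000-word protocol:
print(per_thousand_words(9, 3000))  # 3.0
```

Expressing each measure as a rate keeps an S with high total verbal production from receiving high raw scores on every measure merely because he talked more.
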
TABLE 3

DISTRIBUTION AND SIGNIFICANCE VALUES FOR TOTAL “LANGUAGE SCORES” IN THE ITEM ANALYSIS OF
LANGUAGE MEASURES FOR HYPOSTATIC (H) AND AUTISTATIC (A) GROUPS
(N = 23)

PREDICTIVE CATEGORY OF LANGUAGE MEASURES | TOTAL NUMBER OF LANGUAGE MEASURES INCLUDED | EFFECTIVENESS OF CUTTING POINT: H GROUP RIGHT | H GROUP WRONG | A GROUP RIGHT | A GROUP WRONG | χ²
I. Total—major-predictor measures | 15 | 11 | 1 | 11 | [illegible] | [illegible]
II. Total—raw score summation (Class H) measures* | 12 | 12 | 0 | 10 | [illegible] | [illegible]

* Class H measures are those used to test the general hypotheses of language behavior (cf. Table 2).


RESULTS 


Item analysis. The raw scores for each lan- 
guage measure were placed into parallel fre- 
quency distribution tables for hypostatic and 
autistatic groups, and a cutting score for that 
measure was set at the point of greatest separa- 
tion between these groups. Each measure was 
then treated as a separate item in a language- 
behavior “test.” To accomplish this, it was 
necessary only to assign a consistent direction 
(i.e., “pass” or “fail”) for each measure to the
expected performance for one of the two 
groups of Ss. The expected performance for 
the hypostatic group was arbitrarily assigned 
in the direction of passing. Thus, an S in 
either group whose frequency count for any 
measure placed him within the range of per- 
formance considered to be characteristic of the 
hypostatic group was scored as passing that 
item. “Scores” were computed for each S 
based upon the number of passes for all of 
the items within each category of language 
measures; high scores were considered as the 
expected performance of the hypostatic group. 
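
The item-analysis scoring described above can be sketched in code. The text does not define “greatest separation” operationally, so this sketch assumes it means the cutting score (and direction) that classifies the largest proportion of Ss into their own criterion group; the function names `best_cut`, `passes`, and `language_score` and the example figures are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def best_cut(h_scores: Sequence[float], a_scores: Sequence[float]) -> Tuple[float, bool, float]:
    """Return (cut, h_is_high, proportion_correct) for one language measure.

    h_is_high is True when scores above the cut are characteristic of the H group.
    """
    values = sorted(set(h_scores) | set(a_scores))
    if len(values) < 2:
        raise ValueError("need at least two distinct raw scores")
    # Candidate cutting scores lie midway between adjacent observed values.
    cuts = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]
    n = len(h_scores) + len(a_scores)

    best = None
    for cut in cuts:
        for h_is_high in (True, False):
            right = sum((x > cut) == h_is_high for x in h_scores)
            right += sum((x > cut) != h_is_high for x in a_scores)
            if best is None or right / n > best[2]:
                best = (cut, h_is_high, right / n)
    return best


def passes(raw: float, cut: float, h_is_high: bool) -> bool:
    """An S passes an item when his raw score falls on the H group's side of the cut."""
    return (raw > cut) == h_is_high


def language_score(raw: Dict[int, float], keys: Dict[int, Tuple[float, bool]]) -> int:
    """Total language score: number of measures (items) passed by one S."""
    return sum(passes(raw[m], cut, hi) for m, (cut, hi) in keys.items())


# Made-up raw scores on one measure for four H and four A subjects.
print(best_cut([1, 2, 3, 4], [3, 5, 6, 7]))  # (4.5, False, 0.875)
```

In this made-up example the measure separates the groups in a proportion of 87.5:12.5, which would exceed the 70:30 standard used in the next paragraph for the major predictors.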

Distribution and significance values for total 
“language scores” are shown in Table 3. 
“Predictive category of measures” refers to
the three categories in which the language 
measures were grouped according to their 
separate purposes in testing the specific and 
general hypotheses listed above. Category I, 
major-predictor measures, is comprised of 15 
separate language measures whose proportion 
of discrimination between hypostatic and 
autistatic groups equalled or exceeded 70:30 
in the item analysis; this group of measures 


constitutes the major predictor for the specific 
hypotheses of language behavior.⁸

Category II, group summation measures, is 
composed of 12 of the 13 measures listed in 
Class H in Table 2; as raw score summations 
per individual S of the separate measures 
within each of the common rationale groups, 
these Category II measures, together, consti- 
tute the predictor for the general hypotheses 
of language behavior for hypostatic and 
autistatic groups of Ss. Each of the language 
measures in Category II also provided a 
proportion of discrimination between the cri- 
terion groups equalling or exceeding 70:30 
in the item analysis study.⁹ The remaining 12
language measures out of the 39 used were 
grouped into Category III, minor predictors; 
each of these measures provided a proportion 
of discrimination between 50:50 and 70:30 
for the criterion groups in the item analysis 
study. The language scores for the Category 
III measures are not shown for the item analy- 
sis, but were computed in the cross-validation 
study and are shown in Table 4. 

All chi-square values (for both item analysis 
and cross-validation studies) were corrected 
for continuity by Yates’s correction (8). 

While these values seem high, they, of course, 
do not constitute evidence for or against the 
hypothesis of relationship between the criterion 


8 The separate language measures in this category
are Nos. 2, 3, 4, 6, 7, 9, 12, 13, 14, 16, 19, 21, 23, 25,
and 26. For a description of each of these measures
see Table 2.

9 Measure No. 36 was omitted from Category II
as the only raw score summation measure which pro- 
vided a proportion of discrimination less than 70:30 
in the item analysis study. 
TABLE 4

DISTRIBUTION, SIGNIFICANCE, AND CORRELATION ESTIMATE VALUES FOR TOTAL “LANGUAGE SCORES” IN THE
CROSS-VALIDATION OF LANGUAGE MEASURES FOR HYPOSTATIC AND AUTISTATIC GROUPS
(N = 28)

PREDICTIVE CATEGORY OF LANGUAGE MEASURES | TOTAL NUMBER OF LANGUAGE MEASURES INCLUDED | EFFECTIVENESS OF PREDICTIONS (H GROUP RIGHT, WRONG; A GROUP RIGHT, WRONG)
I. Total—major-predictor measures | 15 | [illegible]
II. Total—raw score summation (Class H) measures* | 12 | [illegible]
III. Total—minor-predictor measures | 12 | [illegible]
IV. Summation—language scores in Categories I and II† | 27 | [illegible]
V. Summation—language scores in Categories I and III† | 27 | [illegible]
VI. Summation—language scores in Categories I, II, and III† | [illegible] | [illegible]

[Cell entries are illegible in the source; the only surviving fragments, which cannot be assigned to cells, are 10, 9, 2, 2, and 3.]

* Class H measures are those used to test the general hypotheses of language behavior (cf. Table 2).

† Refers to the summation per individual of language scores in the respective categories of measures.


and predictor measures beyond the limits of 
the item analysis sample. 

Cross-validation. In the cross-validation
analysis, the cutting score used for each 
language measure was the mean frequency 
for the total group of Ss in the cross-validation 
study. Thus, once the direction of the predic- 
tion for each measure had been determined, a 
simple comparison of results for each measure 
against assumed 50:50 chance distributions 
could be made. 
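
A minimal sketch of this comparison follows, assuming a 2 x 2 table of right and wrong predictions for the two groups and the usual form of chi square with Yates's correction for continuity. The function names and the counts are hypothetical; the counts are not taken from Table 4.

```python
from typing import Sequence, Tuple


def prediction_table(h_scores: Sequence[float], a_scores: Sequence[float],
                     h_is_high: bool) -> Tuple[int, int, int, int]:
    """Cut at the mean of the total group and count right and wrong predictions.

    Returns (H right, H wrong, A wrong, A right), i.e. the cells a, b, c, d
    of a 2 x 2 table with criterion groups as rows and predicted group as columns.
    """
    all_scores = list(h_scores) + list(a_scores)
    cut = sum(all_scores) / len(all_scores)
    h_right = sum((x > cut) == h_is_high for x in h_scores)
    a_right = sum((x > cut) != h_is_high for x in a_scores)
    return h_right, len(h_scores) - h_right, len(a_scores) - a_right, a_right


def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi square (1 df) for a 2 x 2 table, corrected for continuity."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column total must be nonzero")
    # Yates's correction reduces |ad - bc| by n/2, never below zero.
    adjusted = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * adjusted ** 2 / denom


# Hypothetical outcome: 10 of 14 H and 10 of 14 A subjects predicted correctly.
chi2 = yates_chi_square(10, 4, 4, 10)
print(round(chi2, 3))  # 3.571
```

With these made-up counts the corrected chi square falls below 3.841, the value cited in the text as required for the .05 level with one degree of freedom.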

Table 4 shows the results of these predictions
of language behavior for the hypostatic and 
autistatic groups; the entries in this table cor- 
respond to those of Table 3, except for the 
addition of data for language measures in 
Predictive Categories III, IV, V, and VI. As 
discussed above, Category III is composed of 
a residue of 12 measures, out of the total of 
39, which provided relatively less discrimina- 
tion between experimental groups in the item 
analysis study. Predictive Categories IV, V, 
and VI consist of various summations of the 
language scores grouped into Categories I, II, 


and III; the intent of these summations was 
to provide an additional rough test of the 
interrelatedness of all behavior potentials 
subsumed under all of the specific and general 
hypotheses concerning language behavior. 

An examination of Table 4 indicates that 
the chi-square values obtained for the language 
measures in Category I (the major predictor) 
and for Category V (a summation of all of the 
separate measures) would have occurred by 
chance less than one time in a hundred. 
Similarly, the results for Category IV (sum- 
mation of the measures initially providing the 
greatest discrimination between criterion 
groups) and Category VI (summation of lan- 
guage scores for all measures) have p values of 
less than .05. The chi square for the results of 
the measures in Category II (raw score sum- 
mation of groups of measures with a common 
rationale) falls just below the value (3.841) 
required for a .05 level of confidence. The chi 
square for Category III (minor predictors) is 
not sufficiently high to support a generaliza- 
tion. The biserial correlations between the 
criterion and each category of language
measures range from .20 to .69, varying con- 
sistently with the chi-square distribution. 
These results are based upon a prediction 
of both direction and absolute distribution 
values obtained for each measure. If one 
imposes the somewhat less stringent require- 
ment to examine only for direction (i.e., for 
the total number of measures which provided 
some amount of correctly predicted discrimina- 
tion), he finds that 32 of the 39 measures were 
successful; that is, only seven measures had 
distributions opposite to that predicted, or 
provided no discrimination at all.¹⁰ The t value
for the difference between this proportion of
.82 and a hypothetical proportion of .50 is
4.00, which is well above the 2.714 t value
required for a p of .01. A similar result would 
be obtained for the proportion of measures 
showing a difference between means of the 
criterion groups in the predicted direction. 
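
The reported value of 4.00 can be checked with a short calculation. The text does not state the formula used; the sketch below assumes the ratio of the observed difference to the standard error of a proportion under the hypothetical value of .50, which reproduces the reported figure. The function name `proportion_t` is hypothetical.

```python
import math


def proportion_t(successes: int, n: int, p0: float = 0.5) -> float:
    """Ratio of (observed proportion - p0) to the standard error under p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


# 32 of the 39 measures discriminated in the predicted direction.
print(round(32 / 39, 2))               # 0.82
print(round(proportion_t(32, 39), 2))  # 4.0
```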


DISCUSSION 


Methodological considerations. A preliminary 
study (7) of the OSS pointed to a relatively 
high consistency in the subareas of con- 
ceptualization employed by Ss in the sorting 
and resorting phases of the task. That is, 
when E instructed Ss, after the completion of 
the initial phase in the OSS, to try to resort 
the objects in a different way, they usually 
revised their groupings by forming others 
which were predominantly within the same 
conceptual area. However, these results were 
not considered as conclusive because of the 
small N employed.

No direct estimates of reliability for the 
language measures were attempted. However, 
there was indirect evidence of consistency not 
only from the proportionate drop in the dis- 
criminative power of each of the separate 
language measures from the item analysis to 
the cross-validation studies, but also from the 
relative chi-square values obtained for the 
several predictive categories; thus, the lower 
the discriminative power in the item analysis 
study for each predictive category of measures, 
the lower the obtained chi squares tended to 
be in the cross-validation study. 

¹⁰ These seven language measures were Nos. 3, 5, 
15, 16, 24, 27, and 37. 

The results obtained in the OSS did not 
meet the assumptions of normal distribution 
and continuity upon which the biserial correla- 
tion technique is based, nor was the N of 
appropriate size. For a number of reasons, 
though, this technique seemed to provide a 
somewhat more appropriate estimate of degree 
of relationship than other correlation tech- 
niques which would have been operationally 
possible. However, under these circumstances 
the test of the null hypothesis was based upon 
chi-square values, and the correlation esti- 
mates are considered only roughly descriptive 
of the obtained group differences. 

Theoretical considerations. The degree of 
success in predicting differences in language 
behavior between hypostatic and autistatic 
groups suggests that “conceptual area” might 
be more useful for some purposes than “‘con- 
ceptual level,” in the sense that the latter 
term would not have supported such a predic- 
tion; that is, a classification of the object 
sortings into concrete and abstract levels 
would not have differentiated between closed- 
public and private concepts, since both types 
of concepts would have been classified in 
most cases as concrete. Further, since there 
was not a significant difference in test intel- 
ligence among hypostatic, autistatic, and 
metastatic groups in the sample used in this 
study, it might be inferred that the demon- 
strated differences in conceptual behavior 
represent something other than differences in 
conceptual levels, or abilities, among these 
groups. 

An a priori assumption was made that these 
consistent differences in conceptual behavior 
might be associated with more general behavior 
potentials—perhaps defensive in nature—of 
reality-fixing and autism. Other, more specific, 
behavioral terms were assumed to be associated 
with reality-fixing and autism and were used 
as loose guides in setting up general and specific 
hypotheses of differences in language behavior. 
The obtained results for language measures in 
Predictive Categories I and III were used to 
test the specific hypotheses; the general hy- 
potheses were tested by evaluating the results 
for the Category II measures. The summations 
of language scores in Categories IV, V, and 
VI were intended to provide an over-all test 
of interrelatedness for all of the language be- 
haviors which were measured. 

While the results seem to warrant a conclusion 
of considerable interrelatedness of conceptual 
behaviors (including language usage) 
within the experimental groups, it is felt that 
certain limitations exist which would restrict 
the interpretation of the rationale underlying 
the language measures. 

1. The behavior potentials cited in each 
hypothesis of language behavior are not pre- 
cisely defined beyond the operations of the 
present study. 

2. Since the language measures were not 
separately validated, some of the measures 
may bear no relationship at all to the criteria; 
the probability is, however, that most of them 
do because all but seven of the measures 
showed some amount of predicted discrim- 
inative value. 

3. Each measure, as but a single hypothetical 
expression of a more general assumption of 
behavioral consistency, does not in itself 
constitute a full test of that assumption. 


SUMMARY 


Two dimensions of conceptualization, free- 
dom and extensionality, were postulated; 
from these postulations, it was hypothesized 
that the type of conceptual behavior pre- 
viously identified as concrete could be demon- 
strated to occur in two more or less mutually 
exclusive conceptual areas—hypostatic and 
autistatic. With the use of these constructs, 
two experimental groups were separated on 
the basis of consistent differences in concept 
formation in an object-sorting situation. The 
validity of this separation was tested by 
statistically evaluating the significance of 
predicted group differences in language usage. 
Since significant differences in language behavior 
were found to exist as predicted, it was 
concluded that conceptual area might be a 
more useful concept for some purposes than 
conceptual level in that the latter term would 
not have supported such a prediction. The 
degree of interrelatedness which was found to 
exist between conceptual and language be- 
haviors and among the language behaviors 
themselves suggested the existence of more 
general behavior potentials, which were 
termed reality-fixing and autism. 
THE GENERALIZATION OF EXPECTANCIES¹ 


RICHARD JESSOR² 
University of Colorado 


THE general purpose of the present study 
was to determine the conditions under 
which a change in expectancy in a given 
situation tends to generalize to other situations. 
The hypothesis, derived from the social learn- 
ing formulations of Rotter (10, 11), was that 
the generalization of changes in expectancy 
from one situation to other situations de- 
pended, at least in part, on the goal-relatedness 
of these situations to each other. The greater 
the commonality of goals which any two situa- 
tions have, the greater the generalization of 
changes in expectancy from one situation to 
the other. In Rotter’s system, the occurrence 
of a response in a specific situation is assumed 
to be a function, in part, of the expectation 
that it will be followed by or lead to the oc- 
currence of certain goals or external reinforce- 
ments. If the goals do occur or are attained 
(success), the expectancy for that response is 
increased. If the goals do not occur or are not 
attained (failure), the expectancy decreases. 
Expectancy refers to the subject’s (S’s) internal 
probability that a certain event (reinforcement) 
will occur and has been opera- 
tionally defined in a series of studies by such 
referents as a verbal statement by S (7), and 
the observed betting behavior of S in a gam- 
bling situation (2). The similarity of these 
views to those of Tolman (13), Brunswik (1), 
and to Lewin’s concept of subjective proba- 
bility (8) may be readily seen. 
If success and failure change expectancies in 
a specific situation, the question might be 
asked whether such an effect is limited to 
that specific situation alone. We see, in everyday 
life, examples of a person doing well or 
poorly in one situation and then expecting to 
do well or poorly in others; e.g., the college 
student who is refused a date may expect to 
be rejected in other social situations. 


¹ This article is based upon a portion of a disserta- 
tion submitted in partial fulfillment of the require- 
ments for the degree of Doctor of Philosophy in the 
Department of Psychology at The Ohio State University. 
The author wishes to express his indebtedness to his 
advisor, Dr. Julian B. Rotter, for his constant stimula- 
tion and helpful criticism throughout the course of the 
research. Thanks are due also to Drs. Delos D. Wickens 
and Boyd R. McCandless for their assistance during the 
investigation. A summary of this paper was read at the 
1952 meeting of the American Psychological Associa- 
tion in Washington, D. C. 

² The author was a predoctoral Fellow of the Social 
Science Research Council during the period in which 
this experiment was carried out. 

To deal with such generalization phenomena, 
Rotter has relied on the concept of the functional 
relatedness³ of behaviors or responses. 
Responses are considered to be functionally re- 
lated when they have led in the past to the 
same or similar goals or reinforcements. This 
functional relatedness of responses is presumed 
to mediate the generalization of changes in ex- 
pectancies when any given response is rein- 
forced. A parallel concept in a nonexpectancy 
theory is, of course, Hull’s (6) habit-family 
hierarchy wherein the reinforcement of one 
member of the family is shared by the other 
members. 

The problem investigated was the influence 
of success and failure (positive and negative 
reinforcement) upon expectancies in the same 
situation and in related situations. The results 
to be presented will deal with the change in ex- 
pectancies in related situations after success or 
failure in the original situation. On the basis 
of the foregoing theoretical considerations, the 
hypothesis was formulated that the degree of 
generalization of changes in expectancy would 
vary with the degree of goal-relatedness of the 
subsequent situations to the original reinforced 
situation. 

METHOD 

Subjects. A total of 132 male Ss, selected from intro- 
ductory classes in psychology, speech, and English, 
performed the experiment. The majority of the students 
were freshmen. Seven Ss were discarded on the basis 
of criteria which preceded the introduction of the in- 
dependent variable. Only males were used because one 
of the tasks in the experiment involved physical skills, 
and it was felt that success or failure in this area would 
be rather a peripheral experience for most female 
college students. 





³ The nature of the responses which are functionally 
related for any individual depends upon his past learn- 
ings and is not intrinsic to the responses themselves. 
While ultimately it is necessary to study functionality 
for each individual, the problem may be approached 
at this stage by assuming, for certain groups of Ss, 
similar functionality based upon some degree of uni- 
formity of social learning. 
Tasks. The selection of four differentially goal- 
related tasks constituted one of the problems of the 
research. For experimental purposes it was decided to 
deal with subgoals rather than higher-order goals or 
needs. The following subgoals were employed: recog- 
nition for academic skill, recognition for physical skill, 
and love and affection from the opposite sex. They were 
chosen because of their meaningfulness to a male col- 
lege population and to enable comparison with a pre- 
vious study by Crandall (4). 

With the above goals in mind, four tasks were se- 
lected, each of which might be expected to involve 
primarily, for the groups being used, one of these goals. 
In addition, special instructions were utilized in de- 
scribing each task to structure it around the specific 
subgoal desired. The first task was a series of eight 
verbal arithmetic problems typed on a sheet of paper. 
The problems were of increasing difficulty and required 
only arithmetical computation. It was felt that this 
task, by its learned cultural implications for an aca- 
demic group, and by virtue of the instructions accom- 
panying it, e.g., “this task predicts success in college,” 
involved the goal of recognition for academic skill. The 
arithmetic task was used as the original task for all Ss 
and is the task on which success or failure was manipu- 
lated. 

The three remaining tasks constitute the generaliza- 
tion tasks. Of the three tasks, one was intended to in- 
volve the same subgoal as the reinforced arithmetic 
task. This task was described as a measure of vocabu- 
lary knowledge and consisted of a deck of cards with 
anagrams of increasing difficulty on them. Since it is 
of a traditional academic nature, and by virtue of in- 
structions, e.g., “this task measures your general 
knowledge,” it was presumed that this task would also 
involve the need for recognition for academic skill and, 
thereby, be most goal-related to the arithmetic task. 

The second generalization task consisted of an epicyclic 
pursuit rotor whose speed could be controlled. 
This task, by requiring manual participation and by 
use of instructions, e.g., “this is a test of physical co- 
ordination and athletic skill,” was presumed to involve 
the goal of recognition for physical skill. Since this 
task involved the goal of recognition, it was considered 
goal-related to the arithmetic task; however, the fact 
that it involved recognition for physical skill made it 
less goal-related to the arithmetic task than was the 
vocabulary task. 

The third generalization task was structured as a 
test of social attractiveness and likeability to the opposite 
sex. It was described to S as a five-minute inter- 
view with “one of the girls on our staff” in an adjoining 
room. The S was told he would be rated by the girl 
on such things as warmth, friendliness, adaptability 
with the opposite sex, etc. This task was presumed to 
involve the goal of love and affection from the opposite 
sex. Since this task was selected and structured not to 
involve, or, at least, minimally to involve, the goal of 
recognition, it was considered least goal-related to the 
arithmetic task. 

In view of the foregoing considerations concerning 
the goal characteristics of the tasks, it was predicted 
that reinforcement on the arithmetic task should result 
in expectancy changes varying from most to least in the 
following order: arithmetic task, vocabulary task, pur- 
suit rotor, and social attractiveness task. 

Expectancy measure. The level-of-aspiration paradigm 
was adopted in this experiment. The aim of the 
study was described to S as an attempt to find out how 
accurately people can predict their performance in 
a variety of situations. The four tasks were described 
in detail, including the information that possible scores 
on each task ranged from 0 to 50. The S was asked to 
predict, as accurately as possible, the score he would 
get on each task and was told that he would have an 
opportunity to make new predictions for all tasks when- 
ever he finished one of them. 

The S first performed the arithmetic task; he was 
given a predetermined score⁴ which was interpreted by 
means of a table of false norms, and was told, “Well, 
let’s pause for a moment and go through and make these 
predictions again for each task. You predicted —— 
last time. You got ——. What do you predict now?” 
This question was asked for each task in the order 
arithmetic, pursuit rotor, vocabulary, and social task. 

At this phase all the relevant data for the experi- 
ment were completed, that is, predictions of per- 
formance scores on four tasks before and after re- 
inforcement on one of them, the arithmetic task. It is 
necessary at this point to indicate the relationship of 
such data to the purpose of the experiment. 

The S’s answer to the request that he predict his 
score is the referent for the construct of expectancy. 
It is, however, an indirect and relative measure of that 
construct, i.e., it does not provide us with a probability 
value. Instead, the score which S predicts out of a range 
of 0 to 50, while under instructions to be accurate, may 
be considered the score in that range which has, for 
S, the highest subjective probability of occurrence. 
While we do not know the absolute value of the ex- 
pectancy, we know it is higher than the expectancy for 
any other score. After success or failure on the arith- 
metic task, the subject may predict a different score 
for some of the tasks. We assume from this that this 
new score now has the highest expectancy of occur- 
rence. The prediction of a different score than the first 
score predicted, after success or failure, is the referent 
for the measure of expectancy change. It is the generalization 
of these changes which is the problem under in- 
vestigation. 
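
The scoring rule described above can be sketched in a few lines. The sketch assumes that each S's predictions are stored per task as scores from 0 to 50; the data layout, the function name, and the example values are hypothetical and serve only to illustrate the rule that any difference between the first and second prediction counts as an expectancy change.

```python
TASKS = ("arithmetic", "pursuit rotor", "vocabulary", "social")


def expectancy_changes(before: dict, after: dict) -> dict:
    """Flag, per task, whether the predicted score changed after reinforcement."""
    for task in TASKS:
        for score in (before[task], after[task]):
            # Possible scores on each task were described to S as 0 to 50.
            if not 0 <= score <= 50:
                raise ValueError(f"score out of range for {task}: {score}")
    return {task: before[task] != after[task] for task in TASKS}


# Hypothetical predictions for one S (not data from the study).
before = {"arithmetic": 35, "pursuit rotor": 30, "vocabulary": 32, "social": 28}
after = {"arithmetic": 25, "pursuit rotor": 30, "vocabulary": 27, "social": 28}
print(expectancy_changes(before, after))
```

The flag records only that a change occurred, not its size or direction, which matches the analysis in terms of the number of Ss who changed.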

General procedure. Each S was seen individually. 
The S was seated at a table on which were the pursuit 
rotor, two decks of cards containing the anagrams, and 
two sheets of paper containing arithmetic problems. 
The E sat at another table, part of which was shielded 
from S’s vision. After random assignment of S to one 
of the experimental groups, E explained the procedure 
and read the instructions. After stating an expectancy 
score for each of the four tasks, Ss performed the arithmetic 
task and received predetermined scores accord- 
ing to the group to which they were assigned. Scores 
were manipulated by false timing and the use of 
complex-looking tables of norms. The Ss then gave the 
second set of expectancy scores. This completed the 
experimental aspect of the situation. All Ss then per- 
formed either the pursuit rotor or vocabulary task or 
both and were allowed to succeed on these in order to 
overcome the effects of the prior arithmetic failure. 
This did not, however, constitute part of the experi- 
mental procedure. That the failure conditions were 
meaningful was quite apparent from the distress evi- 
denced by most Ss. 

⁴ The S was also asked at the outset what score was 
the lowest one he would still be satisfied in getting. 
This score was utilized by E as the basis for manipulat- 
ing success and failure and is called the minimal goal 
score. Depending on what reinforcement group he was 
assigned to, S was told a false performance score which 
was a specified number of points above or below his 
minimal goal. In the experiment there were a success 
group, a weak failure group, and a strong failure group, 
all defined in terms of the direction and size of the dis- 
crepancy of the reported performance score from the 
minimal goal score. This procedure was an attempt to 
assure that success or failure was defined within S’s 
own framework rather than arbitrarily. For purposes 
of the hypothesis dealt with in this article, these groups 
are combined and treated as a whole. 

After this experience of success, S was told that the 
half hour was up, and, since another S was already 
waiting, it would be necessary to omit the social at- 
tractiveness task. The S was asked not to discuss the 
experiment with anyone, and no S admitted to prior 
knowledge of the experiment when questioned. 


RESULTS 


The analysis of the data was made in terms 
of number of Ss, out of the total group of 125, 
who changed their expectations on a given 
task. The chi-square test for correlated pro- 
portions (9, p. 206) was used to test the sig- 
nificance of the differences between propor- 
tions that changed on the generalization tasks. 
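
A test for the difference between correlated proportions is commonly computed from the Ss who changed on one task but not on the other. The sketch below assumes McNemar's form of the chi-square without continuity correction; the report does not state which form was used and gives only the marginal proportions, so the discordant counts in the example are hypothetical.

```python
import math


def correlated_proportions_chi2(b: int, c: int) -> tuple:
    """Chi-square (1 df) and p value for two correlated proportions.

    b: Ss who changed on task A only; c: Ss who changed on task B only.
    Assumption: McNemar's form without continuity correction.
    """
    if b + c == 0:
        raise ValueError("no discordant Ss; the test is undefined")
    chi2 = (b - c) ** 2 / (b + c)
    # Upper-tail probability of a chi-square variate with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical discordant counts, not taken from the study.
chi2, p = correlated_proportions_chi2(b=30, c=14)
print(round(chi2, 2), round(p, 3))
```

Only the Ss who changed on exactly one of the two tasks enter the statistic; Ss who changed on both or on neither do not affect it.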

That the reinforcement procedure was effective 
is evidenced by the fact that on the 
directly reinforced task, the arithmetic task, 
88.8 per cent of the total group changed their 
expectancies after success or failure. The main 
concern of the present report, however, is 
whether or not expectancy changes generalize, 
and the nature of this generalization. 

The data presented in Table 1 indicate quite 
clearly that generalization has taken place in 
the experiment, that is, that the effect of suc- 
cess or failure on the arithmetic task was not 
confined to that task alone. It may be seen 
from Table 1 that, for all three of the general- 
ization tasks, some Ss changed expectancies. 
This finding provides the necessary basis for 
the analysis required to test our hypothesis. 

TABLE 1 

PROPORTION OF TOTAL GROUP (N = 125) WHO 
CHANGED EXPECTANCIES ON THE THREE 
GENERALIZATION TASKS 

TASK                      PROPORTION WHO 
                          CHANGED EXPECTANCY 

Vocabulary                .448 
Pursuit rotor             .360 
Social attractiveness     .248 


TABLE 2 

DIFFERENCES AMONG GENERALIZATION TASKS IN 
PROPORTION OF TOTAL GROUP (N = 125) WHO 
CHANGED EXPECTANCIES 

                          VOCABULARY       PURSUIT ROTOR 

Social attractiveness     .200 (<.001)*    (<.02)* 
Pursuit rotor             .088 

* Values in parentheses represent probability values for the 
chi-square test of significance of differences between correlated 
proportions. 


According to our hypothesis, generalization-of-expectancy 
changes should be mediated by 
the goal-relatedness of the nonreinforced tasks 
to the reinforced task. Our analysis of the goal 
qualities or characteristics of the generaliza- 
tion tasks used in this experiment indicated 
that the vocabulary task was most goal-related 
to the arithmetic task, the pursuit rotor next, 
and the social attractiveness task least. We 
should therefore expect to find that the pro- 
portion of Ss who change expectancies de- 
creases in the order vocabulary, pursuit rotor, 
and social task. Table 1 confirms our predic- 
tions, the respective proportions being .448, 
.360, and .248. To determine whether these 
proportions were significantly different from 
each other, the analysis presented in Table 2 
was made. 
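
The arithmetic behind these proportions is easy to retrace. The sketch below converts the Table 1 proportions into numbers of Ss out of 125 and forms the pairwise differences in proportion; the variable names are illustrative.

```python
from itertools import combinations

N = 125
# Proportions of Ss who changed expectancy, as reported in Table 1.
proportions = {
    "vocabulary": 0.448,
    "pursuit rotor": 0.360,
    "social attractiveness": 0.248,
}

# Number of Ss implied by each proportion.
counts = {task: round(p * N) for task, p in proportions.items()}

# Pairwise differences in proportion, larger minus smaller.
differences = {
    (a, b): round(abs(proportions[a] - proportions[b]), 3)
    for a, b in combinations(proportions, 2)
}

print(counts)
print(differences)
```

The differences of .200 and .088 obtained this way are the values entered in Table 2.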

Table 2 indicates that the differences among 
the tasks are significant in the predicted direc- 
tion. Some question might be raised concerning 
our acceptance of a p of .13 between the vo- 
cabulary and the pursuit rotor tasks as sig- 
nificant. This value as it stands is not too 
discrepant from the generally accepted .05 
level to indicate a trend toward significance. 
In addition, however, a further analysis of the 
data, excluding the 15 Ss making up the suc- 
cess group, yielded a difference between the 
vocabulary and the pursuit rotor tasks which 
was significant at the .05 level. The significance 
of the differences among the other tasks, in 
this analysis, remained substantially the same 
as given in Table 2. The reason for this analysis, 
excluding the weak success group, was due to 
certain methodological problems which arose 
during the experiment. Success seemed to have 
less face validity for Ss than failure did for Ss 
in the failure groups. For these reasons the 
data for the success group were not considered 
as valid as the data for the failure conditions. 
The relationships found for changes in expectancies 
are very similar, however, including or 
excluding the success group. 


DISCUSSION 


The major hypothesis, that it is possible 
to predict generalization along a dimension 
of functional relatedness of behavior or learned- 
goal similarity, derives support from the 
experimental findings. This may be considered 
one of the important dimensions for which 
Tolman (12) emphasizes a need in psychology. 
At this point it becomes necessary to consider 
several points which indicate the rudimentary 
nature of our findings and the importance of 
further research. 

Although the utility of a dimension of 
similarity of goals in predicting generalization 
seems to have been demonstrated, it must be 
emphasized that we have no knowledge what- 
soever of the nature of the units along that 
dimension. The research has ordered three 
tasks along this dimension, but only relative 
to each other and to the arithmetic task. Until 
some sort of scaling procedure is developed 
whereby we can predict upon the basis of scale 
units rather than relative position, it will not 
be possible to describe in mathematical terms 
the nature of the generalization function. 

The writer is aware of the fact that the goal 
characteristics of the generalization tasks were 
not independently measured. The question 
may be raised, e.g., “how do you know the 
vocabulary task really involved the need for 
recognition for academic skill?” To some extent 
this question is not meaningful. A system of 
personality adopts certain constructs on the 
basis of their utility in prediction. If the need 
or goal constructs, as defined and applied in 
this study, prove to be predictive, that seems 
the only necessary criterion for their utility. 
To ask whether these needs really were present 
is partly to lose sight of the nature of the con- 
structs. The aspect of the question which would 
be meaningful is whether this goal description 
of each task was suitable for all Ss. On this 
point, it would be necessary to agree that our 
procedure assumed that Ss have a certain 
homogeneity of social learning experiences, an 
assumption which, of course, would not be 
strongly defended. For preliminary research 
relying on group differences it was apparently 
useful. Future research could well refine this 
aspect, by using Ss with known social learning 
backgrounds for whom the goals involved in 
any situation would be known with a high degree 
of accuracy, instead of relying upon gross 
cultural uniformities. 

A further point related to the above is 
whether stimulus similarity was fully controlled 
in the experiment. It is not possible to affirm 
this, and it may be that some of the tasks were 
more similar to the arithmetic task on a 
physical basis than others were. If this were 
so, it would enable an explanation of the ex- 
perimental findings on the basis of stimulus 
generalization. Such an explanation is felt to 
be untenable in light of the conditions of the 
experiment, and the writer cannot conceive 
that the predictions made could also have 
been made solely on the basis of stimulus 
generalization. In addition, a study done subsequent 
to the present one, but with a similar 
hypothesis, controlled the factor of stimulus 
similarity. Chance (3), in her study, used 
identical task situations but varied their goal 
character by instructions and attained results 
in agreement with those of the present experi- 
ment. 

A final limitation of the design needs to be 
noted. The order in which all Ss were asked 
to state their expectancies and minimal goals 
was not rotated among the tasks but always 
began with the arithmetic task and followed 
with the pursuit rotor, vocabulary, and social 
attractiveness tasks. If order alone, however, 
were determinative of the degree of generaliza- 
tion, then the fact that more generalization 
occurred to the vocabulary task than to the 
pursuit rotor would be inexplicable. Order, per 
se, cannot account for our findings. 

The results of the present study are related 
to two others. Besides Chance’s corroborative 
study mentioned above, the research of 
Crandall (4), which preceded the present 
experiment, approached the major hypothesis 
in a different manner. Crandall utilized TAT- 
type pictures selected on the basis of their 
need-related stimulus value. The pictures 
dealt with the areas of recognition for physical 
skill, recognition for academic skill, and af- 
fection from opposite sex peers. The Ss told 
stories to the pictures before and after failure 
in a physical skills situation. A control group 
had a rest pause between sessions. His findings, 
using ratings of the stories (5), showed the 
greatest change on the physical skills pictures, 
next on the academic skills pictures, and least 
on the affection pictures. While his results were 
in the predicted direction, the differences in 
amount of change between the three picture 
areas were not statistically significant. The 
present study, utilizing a different methodology, 
deals with the same problem investigated by 
Crandall. 

The implications of the present problem for 
theory and practice seem apparent. Where 
the construct of expectancy is central to a 
personality theory, an understanding of the 
conditions under which expectancies change 
becomes of crucial importance. The knowledge 
that expectancy changes generalize, at least 
in part, according to the goal-relatedness of 
situations would seem to be of importance to 
the clinician as well as the theorist. 


SUMMARY 


The present study was an attempt to test 
a hypothesis derived from Rotter’s formula- 
tions of a social learning theory of personality. 
Within that theory, the occurrence of a re- 
sponse is considered to be in part a function 
of the expectancy that the response will be 
followed by or lead to a specific goal or rein- 
forcement. Expectancies are presumed to 
change on the basis of two factors: (a) whether 
the specific response is or is not followed by 
the reinforcement, and (b) by generalization 
from changes in the expectancies of function- 
ally related behaviors or responses. The con- 
cept of functional relatedness implies that the 
responses lead to the same or similar goals. 
Hence, stated another way, generalization of 
expectancy changes is presumed to occur along 
a dimension of learned-goal similarity. 

To test this latter proposition, the level-of- 
aspiration paradigm was utilized. One hundred 
twenty-five Ss made predictions of their per- 
formance on four tasks before and after failure 
or success on one of them, the arithmetic task. 
Change in expectation statements after success 
or failure was the indirect and relative measure 
of expectancy change. The four tasks were 
selected on the basis of their presumed goal 
potential for the subculture of college males. 
In addition, instructions were used to delineate 
further the primary goal involved in each task. 
The four tasks were an arithmetic test, a 
vocabulary test, a pursuit rotor task, and a 
social interview with a member of the opposite 
sex. These tasks were presumed to involve, 
respectively, the goal of recognition for aca- 
demic skill, the goal of recognition for academic 
skill also, the goal of recognition for physical 
skill, and the goal of love and affection from 
the opposite sex. It was predicted that changes 
in expectancies after success or failure on the 
arithmetic task would generalize to the other 
three tasks on the basis of their goal-related- 
ness to it. Since goal-relatedness to the arith- 
metic task varied in the order in which the 
tasks are listed above, it was predicted that 
the degree of generalization would vary in the 
same decreasing order. 

The results were dealt with in terms of pro- 
portion of Ss who changed their expectancies 
on the three generalization tasks. The data 
corroborated the hypothesis and indicated that 
the three tasks were significantly different from 
each other in relative position along the dimen- 
sion of goal similarity. The findings were dis- 
cussed in terms of limitations and areas of 
further research. 
EFFECTS OF DECISION MAKING BY GROUP MEMBERS ON RECALL 
OF FINISHED AND UNFINISHED TASKS¹ 


MURRAY HORWITZ AND FRANCIS J. LEE 
University of Illinois 


CURRENT theories of motivation have 
developed mainly from studies which 
have treated the individual in isolation 
from a social group. In these studies, as the 
goal-striving individual moves toward goals 
or away from avoidances, his environment 
serves as a fixed frame of reference against 
which changes of position are measured. It is 
possible, however, to conceive of a fixed en- 
vironment as only the limiting case of the 
more general condition in which, to a greater 
or lesser degree, positional relationships within 
the environment are changing. Such a con- 
ception of a changing or “‘active” environment 
seems to be necessary if we consider an indi- 
vidual’s situation as a member of a social 
group. By virtue of action undertaken by the 
group, the individual may find himself in a 
new position relative to his goals or avoidances 
even though he has himself been completely 
quiescent. Thus the individual as a group mem- 
ber may find himself moved toward or away 
from either goals or avoidances by “the ground 
moving under his feet.” If he is located in an 
environment with these escalator-like prop- 
erties, his psychological situation can be said 
to consist on the one hand of positions through 
which he can locomote, as is assumed in 
studies of isolated individuals (11), and on the 
other hand of positions through which he will 
to some degree be “carried” by the group. 

¹ This study was carried out at the Bureau of Edu- 
cational Research, College of Education, University of 
Illinois, and supported in part by a research grant 
from the Human Relations Branch, Office of Naval 
Research, U. S. Navy (Task No. NR 172-201). The 
authors are indebted to William M. Peterson for 
assistance in the conduct and analysis of the experi- 
ment. 

The study reported here is the second in 
which we are investigating the consequences 
for motivational theory of examining the 
individual in an active social environment. 
The problem dealt with in this paper grows 
out of findings of a previous experiment (9) 
in which one of the authors investigated the 
effects upon a person’s need tensions of being 
“carried” toward or away from goals or 
avoidances by group action. In this earlier 
investigation, individuals working in a group, 
but voting independently of one another, were 
asked to indicate whether they desired to 
finish or to avoid finishing each of a series of 
puzzles which the group had begun. In some 
cases where subjects (Ss) voted to finish work 
on a puzzle, the group completed this task, 
while in other cases the work was left un- 
finished. Similarly, where an S voted to avoid 
work on the puzzle, he might find the group 
acting in line with his vote or in opposition 
to it. The existence of tension systems was 
measured by Zeigarnik’s (19) method, which 
assumes that level of recall of tasks is a direct 
function of the magnitude of unreduced tension 
associated with the tasks. According to 
Zeigarnik’s assumptions, tension systems are 
unreduced for unfinished tasks but are reduced 
for finished ones, and recall of unfinished tasks 
should therefore be greater than recall of 
finished tasks. In the case of puzzles which Ss 
had voted to finish, a comparison of levels of 
recall for actually finished and unfinished 
puzzles indicated that tension systems could 
be regarded as reduced when S was “carried” 
by the group into a situation in which the 
group task was completed, but that tension 
systems were not reduced when the group 
“carried” him only part way toward comple- 
tion. However, if the S had voted to avoid the 
puzzle, tension systems corresponding to 
avoidance were reduced where the group’s 
action satisfied S by removing any possibility
that he would be “carried” into the
region of avoidance; tension systems were not 
reduced where the group completed the puzzle, 
“carrying” him against his desire into the 
situation he wished to avoid. 

It was apparent from these results that one 
could extend the theory of tension systems (13) 
in a fairly direct way to the active environ- 
ment provided by a social group. This study 
produced one unexpected result, however, 
which pointed to a peculiar characteristic of 
decision making when viewed in the context 
of this type of environment, and which suggested
moreover that decision making has
direct effects upon the operation of tension 
systems. Certain Ss found that in the course 
of voting whether or not to work on the 
puzzles their votes differed frequently from the 
announced majority decision. On the basis of 
qualitative observations, it appeared that 
under these conditions Ss lost interest in 
completion or noncompletion of puzzles, and 
instead seemed to define a new task for them- 
selves, namely, to vote so as to agree with the 
majority of the group. The data revealed that 
these Ss shifted the basis on which puzzles 
were recalled, recalling a predominance of 
puzzles for which their votes had agreed with 
the group. But this is an odd result in terms of 
tension-system theory, for, given the task of 
agreeing with the group, agreements can be 
regarded as finished tasks and disagreements 
as unfinished ones. By recalling more agreements,
these Ss were apparently reversing the
Zeigarnik effect. It was further observed that 
in attempting to agree with the anticipated 
group vote, the decision about how to vote 
had now become a difficult problem for these 
Ss. To interpret the paradoxical finding con- 
cerning the Zeigarnik reversal, hypotheses were 
suggested about possible relationships between 
decision making, tension systems, and recall. 

In studying decisions of an isolated indi- 
vidual, it can be legitimately assumed that if 
an individual pursues one of two alternative 
paths of action, he has made a decision be- 
tween them. However, the peculiar character- 
istic of decision making in an active social 
environment is that just as a person may be 
“carried” toward or away from goals or avoid- 
ances, he may be “carried” by the group past 
the decision point before he has been able to 
make a choice between alternatives. In a group
situation, that is, he may find that he has not 
been given an opportunity to “make up his 
mind” and must act while still in a state of 
indecision. To account for the Zeigarnik re- 
versal it was hypothesized that tension sys- 
tems will affect recall differently in states of 
indecision and decision: if an individual is in a 
state of indecision, then tension systems will 
tend to be expressed in a wish-fulfilling manner, 
resulting in greater relative recall of finished 
tasks; if he is in a state of decision, then tension 
systems will tend to be expressed in terms of 
goal-directed activity, hence in greater relative 
recall of unfinished tasks. In the next section 
we attempt to show how this hypothesis can be
derived within Lewin’s theory of motivation,
and we describe an experiment designed to 
test this theoretical interpretation of the 
previously obtained results. 


DERIVATION OF THE HYPOTHESIS 


For the sake of brevity, we shall indicate 
the general form of derivation of the hypoth- 
esis rather than attempt a formal derivation. 
The form of the argument, using concepts 
within Lewin’s theory of motivation, is as 
follows: 

Assumption 1. An individual will manifest 
tendencies toward wish fulfillment (opera- 
tionally: tend to recall finished tasks) in 
psychological situations of high fluidity; an 
individual will manifest tendencies toward 
action (operationally: tend to recall unfinished 
tasks) in psychological situations of low 
fluidity. 

Assumption 2. A state of indecision (“considerations”)
corresponds to a psychological
situation of high fluidity; a state of decision 
(“readiness for action”) corresponds to a 
psychological situation of low fluidity. 

Hypothesis. An individual will tend to recall 
finished tasks if he is in a state of indecision; 
he will tend to recall unfinished tasks if he is in 
a state of decision. 

The concepts employed above and the ra- 
tionale for the assumptions made require some 
discussion. Within Lewin’s theory, the struc- 
ture of the psychological situation is treated 
in terms of an individual’s view of positional 
relationships between means and goals (11). 
Lewin further characterizes psychological sit- 
uations in terms of their fluidity, or degree of 
ease with which restructuring can occur. An 
individual who is engaging in fantasy (level of 
irreality) is in a situation of high fluidity so 
that a relatively weak force, such as a simple 
suggestion, may induce him to view himself as 
having attained his goal, though objectively 
the goal may be quite distant. However, while 
he is engaging in everyday business (level 
of reality), the degree of fluidity of the psy- 
chological situation is low. His view of the 
steps he must take to reach his goal will tend 
to correspond more closely with objective facts, 
and he will usually require stronger forces in 
the form of evidence, persuasive arguments, 
etc. to modify his views. 

In terms of this concept of degree of fluidity, 
we suggest the following rationale for assumption
1. The existence of a need tension within
the individual implies that tendencies will be
aroused to reduce the psychological distance 
between his present position and his goal. 
Given a highly fluid cognitive structure, these 
tendencies are assumed to induce a restructur- 
ing of cognitions in the direction of wish 
fulfillment; that is, rather than acting toward 
the goal, the person tends to alter his view of 
his situation so that he sees himself as having 
attained the goal. Since need tensions are not, 
or only slightly, reduced by such irreal be- 
havior (7), they continue to operate, and the 
person in this situation should continue to 
think about goal attainments. It follows that 
if tension systems exist related to performing a 
series of tasks, a person behaving on the level 
of irreality should recall more finished ones, 
i.e., attainments. (This process seems to cor- 
respond with Rapaport’s [17] description of the 
primary process in psychoanalytic theory, in 
which drives operate to cathect the memory of 
past gratifications.) On the level of reality, 
however, the cognitive structure is less fluid 
and resists such restructuring. In order to 
reduce his psychological distance from the 
goal, the individual is now required to loco- 
mote, i.e., engage in action to attain the goal. 
Ovsiankina (16) has shown that where a sub- 
ject is locomoting toward a goal, interruption 
of the task leaves him with tendencies to 
resume locomotion. Thus, where thinking is 
related to action, he will tend to continue to 
think about the task which is to be resumed, 
and if tension systems related to a series of 
tasks exist, a person behaving on the level of 
reality should recall more unfinished ones. 
(This appears to correspond with the psycho- 
analytic secondary process in which ideas 
serve, not as cathexes of the memory of past 
gratifications, but to “enhance the chances... 
of discovering the object in reality—or so 
change reality that the object becomes avail- 
able” [17, p. 74].) 

We propose, in assumption 2, that states of 
indecision correspond to being in a fluid 
psychological situation, and that states of 
decision correspond to being in a nonfluid 
situation. Here we follow the conceptual 
treatment of decision making proposed by 
Cartwright and Festinger (3) and Lewin (12). 
The individual in a condition of indecision is 
in a state of “considerations,” that is, he 
views himself as first in one, then in another, 
of two overlapping situations. His situation is 
fluid because in a state of considerations, relatively
small forces can induce him to “change
his mind” about his distance from the goal; 
that is, we assume that, with respect to degree 
of fluidity, a state of considerations is closer 
to fantasy than to action. When the individual 
makes a decision, he eliminates all but one 
alternative, and is no longer fluctuating be- 
tween overlapping possibilities. He moves from 
a psychological situation of considerations to a 
situation of readiness for action. The assump- 
tion involved is that making a decision implies 
a “freezing” of the previously fluid cognitive 
structure. From the preceding assumption 
that need tensions produce wish-fulfillment 
tendencies in fluid situations and action tend- 
encies in nonfluid situations, our hypothesis 
follows, namely, that finished tasks will tend 
to be recalled while an individual is in a state 
of indecision and that unfinished tasks will 
tend to be recalled while he is in a state of 
decision. 

In the next section we describe the experi- 
ment designed to test this hypothesis. The 
hypothesis, as we noted in the introduction, 
was suggested by a proposed interpretation of 
an earlier unexpected finding that in a situa- 
tion where Ss voted for or against performing a 
task, recall of tasks by certain “deviant” Ss 
was mainly influenced by whether or not their 
votes on given tasks agreed with the majority 
vote. This finding was attributed to the occur- 
rence of uncontrolled and only qualitatively 
observed variations in the experiment, namely, 
that these Ss had subjectively redefined the 
task to that of being in agreement with the 
group, and that as a result they were fre- 
quently in a state of indecision. In the experi- 
ment reported here, the task explicitly as- 
signed to the Ss is the one which was presumed 
to have been established by these “deviant” Ss.
This procedure has the advantage of possible 
gains in information by testing our present 
hypothesis while simultaneously checking the 
proposed interpretation of the earlier finding. 


METHOD²


2 To condense this article, copies of the verbatim
instructions, the various forms used in the experiment,
and the original tabulations of data have been deposited
with the American Documentation Institute. Order
Document No. 4059 from ADI Auxiliary Publications
Project, Photoduplication Service, Library of Congress,
Washington 25, D. C., remitting in advance $2.00 for

Subjects. The Ss were female undergraduates at the
University of Illinois, recruited in five-member teams
to represent their sororities in a test which was described
to them as measuring group sensitivity. There
were eight groups, each meeting for one two-hour
session. Although a total of 40 Ss participated, the
procedure required that one member of each of the
eight groups receive a different experimental treatment
from the others. Data obtained from these eight Ss are
not reported in the present results, which are based on
the remaining 32 Ss.

Apparatus. The physical setup was designed to 
permit Ss to work together in assembling a series of 
jigsaw figures, while allowing only a minimum of com- 
munication among them. The Ss were seated around a 
circular table approximately 5 ft. in diameter. The 
jigsaw figures were placed within a smaller concentric 
circle approximately 3 ft. in diameter. Vertical shields 
(plywood panels 22 in. high and 18 in. long) extended 
from this center circle to 6 in. beyond the outer edge 
of the table. These shields were placed so as to divide 
the table into six segments or “booths,” one for each 
of the Ss and one for the experimenter (E). In addition,
window shades were mounted at the end of each booth, 
and drawn to a level where Ss, while sitting in their 
booths, had a clear view of the jigsaw puzzles placed
in the center work space, but could not see the occupants
of booths opposite their own.

Assembling the jigsaw puzzles was entirely a manipulative
task. The jigsaw puzzles were 16 simple figures
of such objects as “telephone,” “shoe,” etc. Outlines
of figures were drawn on white cardboard (1344 in.
X 13% in.), and each figure was divided into five sections.
These outlines could be filled in with pieces of
colored poster board cut to match the five sections of
the outline. Pieces were distributed by E around the
outline, and Ss moved them into place using rods (30 in.
long) mounted on swivels in each booth. Attached to
each of these rods was a small metal tip which enabled
Ss to lift or push the jigsaw pieces into their appropriate
places on the outline of the figure.

In order that Ss could register votes about whether
or not they desired to complete a given puzzle, two-way
toggle switches were installed in the booths. These
switches were wired to a panel of lights in E’s booth so
that each S could activate a green or red light to indicate
a “yes” or “no” vote, respectively.

Procedure. The major features of the experimental
procedure can be summarized as follows:

1. To enhance Ss’ receptivity to the experimental
goal of being in agreement with the group, two pre-experimental
devices were used: (a) a 15-minute group
discussion concerning the effects of sorority life on
improving Ss’ “sensitivity to others’ feelings”; (b) a
so-called “Test on Group Membership” in which Ss
could—and did—personally commit themselves to the
importance of being in agreement with their group.

2. The rules for assembling the jigsaw puzzle required
that Ss work together to place three of the five
pieces in place within the puzzle outline, and then vote
in private whether to stop at that point or continue
assembling the remaining two pieces.

3. The group was informed that their score for 
“sensitivity” would be determined by the level of 
agreement attained in the vote—a unanimous vote
being scored 100 points; agreement among four out of
five, 50 points; a simple majority, 30 points. 

4. Each S was supplied with 20 voting tabs, half 
of which were marked “yes,” and half “no.” Before 
voting each S was required to give up a tab corre- 
sponding to her intended vote, thus insuring that both 
“yes” and “no” votes would be distributed over the 
16 puzzles. 

5. The Ss were provided a supply of “message 
forms” on which they could indicate how they intended 
to vote and their reasons for so voting. In order to
engender additional feelings of responsibility in the 
voting, each of four Ss was privately informed that 
she had major influence over the outcome, since her 
message—and only hers—would be delivered to another 
S. Each time message forms were collected, one form 
was shown to the fifth S, thereby leading each of the 
others to believe that her own message had been 
delivered. 

6. After messages and voting tabs were turned in, 
Ss registered their votes using the toggle switches at 
their work places. The E recorded their separate votes,
and then announced to the whole group at once the 
results of the voting. Announcements of the vote fol- 
lowed a prearranged experimental plan, and did not 
necessarily correspond with the actual vote. Votes on 
the first two puzzles were announced as “yes,” since 
announcement of “no” votes so early in the series 
might have been interpreted by Ss as meaning that the 
group was rejecting the experimental situation. Subse- 
quent announcements were varied so that each S 
would find her own vote about equally often in agree- 
ment or disagreement with the group vote. 

7. After an announced “yes” vote, the group as- 
sembled the two remaining pieces of the puzzle before 
starting a new one. After a “no” announcement, the 
group immediately began to work a new puzzle. 

8. In order to avoid any suggestion that Ss were 
being given a memory test (13), E, after the above
cycle had continued for 16 puzzles, stated that he
wished to get Ss’ reactions to the puzzles up to that 
point, and asked, “Would you jot down the names of 
any of the figures which come to your minds, in order 
to have something specific to refer to when you give 
your reactions.” The Ss were allowed a minute and a 
half for listing names of puzzles, after which the lists
were collected. A questionnaire entitled, “Your Reac- 
tions to This Test,” designed to check Ss’ under- 
standing and acceptance of the experimental instruc- 
tions, was then distributed. 
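
The scoring rule that Ss were told about (step 3) and the voting-tab constraint (step 4) can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, and the second function simply works out the arithmetic consequence of holding 10 "yes" and 10 "no" tabs for 16 puzzles.

```python
def sensitivity_score(votes):
    """Score one round of voting by five Ss, using the point values in step 3."""
    yes = sum(1 for v in votes if v == "yes")
    largest_bloc = max(yes, len(votes) - yes)
    # Unanimous vote: 100 points; four out of five: 50; simple majority: 30.
    # With five voters the larger bloc is always 3, 4, or 5.
    return {5: 100, 4: 50, 3: 30}[largest_bloc]


def yes_vote_bounds(puzzles=16, tabs_per_kind=10):
    """Fewest and most "yes" votes one S can cast, given her supply of tabs."""
    most_yes = min(puzzles, tabs_per_kind)
    # Every puzzle not covered by a "no" tab must be voted "yes".
    fewest_yes = puzzles - min(puzzles, tabs_per_kind)
    return fewest_yes, most_yes


print(sensitivity_score(["yes", "yes", "yes", "yes", "no"]))
print(yes_vote_bounds())
```

Under these assumptions each S must cast between 6 and 10 votes of each kind over the 16 puzzles, which is what guarantees that both kinds of vote appear.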

Measures used. One of the items on the message
forms read:

“As far as I can figure out the basis for the group’s
voting, the group decision:

(check one)
a ___ Can turn out either way
b ___ Will probably turn out to be ‘yes’
c ___ Will probably turn out to be ‘no.’ ”

Since Ss’ goal was to agree with the group, those who
checked a can be regarded as in a state of indecision
concerning their vote; those who checked b or c can
be regarded as in a state of decision. For a given S, 
puzzles associated with these respective states can be 
classified as either “decision” puzzles or “indecision” 
puzzles. 

Puzzles can be further classified as ones on which 
given Ss had been in agreement or disagreement with 
the announced group vote, hereafter designated as
“agreement” and “disagreement” puzzles. The tendency
to recall finished tasks is measured by relative
recall of agreement puzzles, and tendency to recall 
unfinished tasks is measured by relative recall of dis- 
agreement puzzles. To avoid misunderstanding, it 
should be noted that we depart here from the more 
familiar practice in which unfinished tasks are coordi- 
nated to interrupted puzzles while finished tasks are 
coordinated to completed puzzles. In terms of the goal 
defined for Ss in the present situation, we assume that 
the task on a given puzzle is finished where an S’s 
vote agrees with the announced group vote. The task 
is assumed to remain unfinished where S’s vote dis- 
agrees with the announced group vote. 


RESULTS 


The hypothesis being tested can be divided 
into two subhypotheses, dealing with the 
decision and indecision conditions, respec- 
tively: (a) recall of decision-disagreement 
puzzles will exceed recall of decision-agreement
puzzles (Zeigarnik effect); (b) recall of
indecision-agreement puzzles will exceed recall 
of indecision-disagreement puzzles (Zeigarnik 
reversal). 

Considering the decision condition first, we 
can compare the proportions of disagreement 
and agreement puzzles recalled by each S, and 
determine how many Ss did and did not recall 
relatively more disagreement puzzles. The null 
hypothesis is equivalent to the statement 
that the occurrence of an S whose recall con- 
forms with the hypothesis is equiprobable with 
the occurrence of an S whose recall does not 
conform. This can be tested by the binomial 
expansion with p = .5, using a one-tailed test 
since direction is predicted. Thirty-one Ss can 
be compared; for one S the comparison is 
impossible since she never found herself in 
disagreement with the group while in a
decision condition. Twenty Ss (64.5 per cent)
recalled relatively more disagreements than 
agreements in conformity with the hypoth- 
esis; 11 Ss (35.5 per cent) did not recall 
more disagreements than agreements. On the 
assumption made in the null hypothesis, the 
probability of obtaining this distribution by 
chance is .075 (20). This is a somewhat con- 
servative test of the hypothesis, because four 
Ss who recalled equal numbers of agreements 
and disagreements are included among the 
negative cases. 

If these four Ss are eliminated from con- 
sideration, then 20 Ss (74.1 per cent) recalled 
more disagreements than agreements and 
seven Ss (25.9 per cent) recalled more agreements
than disagreements; the probability of
obtaining this outcome by chance is approximately
.01. If the four Ss are distributed
equally between the two groups, then p = .02. 
In any case the subhypothesis regarding recall 
under conditions of decision can be accepted 
at close to or better than the .05 level of 
confidence. 

For states of indecision we can compare only 
17 of the 32 Ss, since nine Ss never indicated 
that they were undecided, and of the remaining
23, two had no agreements and four had
no disagreements in the indecision conditions. 
Here we find the predicted reversal in recall 
tendencies. Only five Ss (29.4 per cent) re- 
called more disagreements than agreements, 
but 12 Ss (70.6 per cent) recalled more agree- 
ments than disagreements. Evaluated as 
above, the subhypothesis regarding recall 
under conditions of indecision can be accepted 
at the .072 level of confidence. 
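
The one-tailed binomial probabilities quoted above can be checked with a short calculation. The sketch below assumes an exact binomial tail with p = .5, as the text describes; the function name is hypothetical and is not taken from the original report.

```python
from math import comb


def sign_test_p(conforming, n):
    """One-tailed probability of at least `conforming` successes in n trials, p = .5."""
    tail = sum(comb(n, k) for k in range(conforming, n + 1))
    return tail / 2 ** n


# Decision condition, the four tied Ss counted as negative cases: 20 of 31.
print(round(sign_test_p(20, 31), 3))  # 0.075
# Decision condition, the four tied Ss dropped: 20 of 27.
print(round(sign_test_p(20, 27), 3))  # 0.01
# Indecision condition: 12 of 17 Ss recalled more agreements.
print(round(sign_test_p(12, 17), 3))  # 0.072
```

The exact tails agree with the reported values of .075, approximately .01, and .072.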

The method used above for testing the two 
subhypotheses treats the proportions of agree- 
ments or disagreements recalled by each S as 
scores which represent the strength of S’s 
tendency to recall either type of puzzle. It 
is clear that this method is an insensitive one, 
in that it ignores information in the scores 
about the strength of these recall tendencies. 
We next evaluate our data by means of the t
test, which uses the information about strength 
of recall, although it has the disadvantage of 
combining proportions whose variances cannot 
be regarded as homogeneous, particularly 
where, as here, they are based on samples of 
unequal size (5). 

In the decision condition, the mean of 
proportions of disagreement puzzles recalled 
by the 31 Ss is .414. The mean proportion of 
agreement puzzles recalled is .279. The value 
of t for the difference between these correlated
means is 2.46 (Table 1), which for 30 df is 
significant at the .01 level, using a one-tailed 
test. For the 17 Ss who can be compared in the 
indecision condition, the mean proportion of 
agreements recalled is .505; the mean pro- 
portion of disagreements recalled is .277. Here 
t = 1.90, which for 16 df is significant at better 
than the .05 level. These results are again 
consistent with the hypothesis. 
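
For correlated means, t is the mean of the per-subject differences divided by its standard error. The individual scores are not given here, so the sketch below only works backward from the rounded figures reported above to the standard deviation of differences they imply. The function names are hypothetical, and the implied values are approximate because the inputs are rounded.

```python
from math import sqrt


def paired_t(mean_diff, sd_diff, n):
    """t for correlated means: mean difference over its standard error."""
    return mean_diff / (sd_diff / sqrt(n))


def implied_sd(mean_diff, t, n):
    """Standard deviation of the differences implied by a reported t (df = n - 1)."""
    return mean_diff * sqrt(n) / t


# Decision condition: .414 - .279 = .135, t = 2.46, 31 Ss.
sd_decision = implied_sd(0.414 - 0.279, 2.46, 31)
# Indecision condition: .505 - .277 = .228, t = 1.90, 17 Ss.
sd_indecision = implied_sd(0.505 - 0.277, 1.90, 17)

print(round(sd_decision, 2), round(sd_indecision, 2))
print(round(paired_t(0.135, sd_decision, 31), 2))
```

On these figures the differences are more variable in the indecision condition, which is one reason the larger mean difference there yields the smaller t.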

TABLE 1

RESULTS OF t TESTS COMPARING RECALL AMONG
PAIRS OF TYPES OF PUZZLES

                                   MEAN DIFFERENCE
TYPE OF PUZZLE           DECISION-       INDECISION-    INDECISION-
                         DISAGREEMENT    AGREEMENT      DISAGREEMENT
Decision-agreement       .134**          .292††         .060
Decision-disagreement                    .195†          .004
Indecision-agreement                                    .228*

* t = 1.90, 16 df, p < .05 (one-tailed test).
† t = 2.61, 20 df, p < .05 (two-tailed test).
** t = 2.46, 30 df, p < .01 (one-tailed test).
†† t = 4.13, 19 df, p < .01 (two-tailed test).

One further statistical test of these data
throws some light on whether the above results
are seriously affected by our dealing with proportions
based on small samples. The recall
scores for decision-disagreement and decision-agreement
puzzles were based on samples of
mean size, N = 5.1 (range N = 1 to 8) and
N = 5.4 (range N = 2 to 8), respectively.
Where proportions are given equal weight in
averaging, without regard to differences in size
of sample, the proportions based on the samples
of smaller size are more likely than those based
on larger samples to take extreme values, and
therefore may have unjustifiably strong effects
in either raising or lowering the mean of all
proportions. In Table 2, we pool all decision-disagreement
puzzles and all decision-agreement
puzzles, respectively, and compare the
frequencies with which these puzzles are recalled,
disregarding the fact that puzzles were
recalled by different Ss. The effect of this procedure
is to give greater weight in the comparison
to recalls which are based on larger samples.
The procedure appears to be statistically justified,
for it assumes that the frequency with
which an individual recalls puzzles under a
given treatment can be treated as a random
sample drawn from a common population of
individuals within that treatment. The tenability
of this assumption is tested by considering
whether the variations in frequencies of individual
recall are no greater than what would
be expected in randomly sampling homogeneous
material. The test for homogeneity applied
to the data on recall of decision-agreement
puzzles by separate Ss yields a chi square of
25.045, which for 30 df is not significant.
For recall by separate Ss of decision-disagreement
puzzles, chi square = 29.629, 30 df,
which is likewise not significant, indicating
that it is legitimate to pool individual data
under each class of puzzles, and to treat the
data thus combined as a single sample representative
of a common population. As shown
by Table 2, the pooled recall of disagreement
puzzles (43.4 per cent) is significantly greater
(p < .01) than recall of agreement puzzles
(29.1 per cent). It is interesting that the corresponding
mean proportions reported above,
where the t test was applied to unpooled data,
are, in terms of percentages, 41.4 per cent
and 27.9 per cent, respectively, suggesting
that variations in sample size had negligible
effects on the results.

TABLE 2

COMPARISON OF POOLED RECALL OF AGREEMENT AND
DISAGREEMENT PUZZLES IN STATES OF DECISION

TYPE OF PUZZLE           RECALLED       NOT RECALLED    TOTAL
Agreement puzzles        46 (29.1%)     112 (70.9%)     158 (100%)
Disagreement puzzles     73 (43.4%)     95 (56.6%)      168 (100%)
Total                    119            207             326

Note: χ² = 7.22, 1 df, p < .01 (one-tailed test).
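
The contrast drawn above, between averaging per-subject proportions with equal weight and pooling the counts, can be made concrete with a small sketch. The three records below are hypothetical numbers chosen only for illustration; they are not data from the experiment, and the function names are likewise assumptions.

```python
def mean_of_proportions(records):
    """Average the per-subject proportions, giving each S equal weight."""
    return sum(recalled / n for recalled, n in records) / len(records)


def pooled_proportion(records):
    """Pool counts over Ss, so that larger samples carry more weight."""
    total_recalled = sum(recalled for recalled, _ in records)
    total_puzzles = sum(n for _, n in records)
    return total_recalled / total_puzzles


# Hypothetical (recalled, puzzles) pairs; the first S has a sample of one.
hypothetical = [(1, 1), (2, 8), (3, 8)]

print(round(mean_of_proportions(hypothetical), 3))  # 0.542
print(round(pooled_proportion(hypothetical), 3))    # 0.353
```

In this made-up case the single extreme proportion from the smallest sample pulls the unweighted mean well above the pooled value. The close agreement between the two measures in the reported data is what suggests no such distortion occurred.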

The same comparison is made in Table 3 
for those cases where Ss were in a state of 
indecision. Again the legitimacy of pooling 
data from separate Ss is supported by the 
results of chi-square tests for the homogeneity 
of individual recall (chi square = 22.013, 17 df, 
for indecision-disagreement puzzles; chi square 
= 14.040, 17 df, for indecision-agreement 
puzzles—neither chi square being significant). 
The predicted reversal in recall is again apparent
in Table 3. A higher percentage of
agreement puzzles is recalled (50.0 per cent) 
than disagreement puzzles (32.0 per cent), 
this difference being significant at better than 
the .05 level. The corresponding mean propor- 
tions once more appear to be very close, being 
in percentage terms, 50.5 per cent and 27.7 
per cent, respectively. 
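
The chi-square values for the pooled comparisons can be recomputed from the recalled and not-recalled counts in Tables 2 and 3. The sketch below assumes the ordinary Pearson formula for a 2 x 2 table without a continuity correction, which is the form that reproduces the reported figures; the function names are hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def one_tailed_p(chi2):
    """Half the upper-tail probability of chi square with 1 df."""
    return erfc(sqrt(chi2 / 2)) / 2


# Rows: agreement, disagreement puzzles. Columns: recalled, not recalled.
decision = chi_square_2x2(46, 112, 73, 95)
indecision = chi_square_2x2(29, 29, 16, 34)

print(round(decision, 2), one_tailed_p(decision) < 0.01)      # 7.22 True
print(round(indecision, 2), one_tailed_p(indecision) < 0.05)  # 3.58 True
```

Halving the two-tailed probability is appropriate here only because the direction of each difference was predicted in advance.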

TABLE 3

COMPARISON OF POOLED RECALL OF AGREEMENT AND
DISAGREEMENT PUZZLES IN STATES OF
INDECISION

TYPE OF PUZZLE           RECALLED       NOT RECALLED    TOTAL
Agreement puzzles        29 (50.0%)     29 (50.0%)      58 (100%)
Disagreement puzzles     16 (32.0%)     34 (68.0%)      50 (100%)
Total                    45             63              108

Note: χ² = 3.58, 1 df, p < .05 (one-tailed test).

We have made no predictions comparing
levels of recall across the decision and indecision
conditions. Theoretically, tension systems
should not increase recall for decision-agreement
puzzles or indecision-disagreement puzzles.
The mean proportions of recall for Ss who
can be compared on these two sets of puzzles
are .295 and .355, respectively, the difference
of .060 (Table 1) being nonsignificant, which
suggests that in the absence of effects due to
tension systems, decision and indecision puzzles
tend to be equally recalled. The percentages
based on pooled recall of these puzzles are 31.5
per cent and 38.8 per cent, respectively (chi
square = .686, 1 df, not significant), which are
sufficiently close to the preceding mean proportions
to reassure us that the latter have not
been unduly affected by individual proportions
which are based on small samples.

Theoretically, tension systems should oper- 
ate to increase recall of agreement puzzles in 
the indecision condition, but should not affect 
recall of agreement puzzles in the decision 
condition. The mean proportions for Ss who 
can be compared for these two classes of 
puzzles are .571 and .279 respectively, the
difference (.292) being significant at better
than the .01 level (Table 1). We used a two-tailed
test, since differences in recall across
the indecision and decision conditions are not
derivable from our theory without an additional
assumption, namely, that recall between
the two conditions will be equal in the absence
of effects from tension systems. The corresponding
percentages based on pooled recall
are 51.8 per cent and 30.5 per cent (chi
square = 6.337, 1 df, .02 > p > .01), which
appears to indicate again that the preceding
mean proportions were not unduly affected by
proportions based on small samples.

In a similar fashizn, we can compare the 
mean proportior; for decision-disagreement 
puzzles, wherc tension systems are theoreti- 
cally oper:tive, with indecision-disagreement 
puzzles, where they are presumably inopera- 
tive. Here the mean proportions are very close, 
340 and .336 respectively, the difference (.004) 
being nonsignificant. The corresponding pooled 
values are 35.5 per cent and 35.2 per cent, 
respectively (chi square = .0016, 1 df, not 
significant). ‘Thus, we find no evidence of re- 
liably greater recall where tension systems 
are operative in situations of decision than 
where tension systems are inoperative in situ- 
ations of indecision. This, it should be noted, 
is in contrast to the preceding finding that the 
operation of tension systems in situations of 
indecision produces greater recall than the
nonoperation of tension systems in situations
of decision. 
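
The distinction drawn above between a mean of individual proportions and a percentage based on pooled results can be illustrated with a minimal sketch. The counts below are hypothetical and are not data from the study; the function names are likewise illustrative only.

```python
from fractions import Fraction

# Hypothetical (items recalled, items available) for four Ss.
# These counts are invented for illustration; they are NOT from the study.
counts = [(1, 2), (3, 4), (2, 8), (0, 1)]


def mean_of_proportions(pairs):
    """Average each S's own proportion, giving every S equal weight."""
    return sum(Fraction(recalled, total) for recalled, total in pairs) / len(pairs)


def pooled_proportion(pairs):
    """Pool all items first, so Ss contributing more items carry more weight."""
    recalled = sum(r for r, _ in pairs)
    total = sum(n for _, n in pairs)
    return Fraction(recalled, total)


mean_p = mean_of_proportions(counts)
pooled_p = pooled_proportion(counts)
print(float(mean_p), float(pooled_p))
```

When the two figures agree closely, proportions resting on very few items (such as the 0-of-1 case) cannot have dominated the mean; when they diverge, the mean is open to the small-sample objection raised in the text.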

The question now arises whether tension 
systems operate with equal effect in the 
decision and indecision conditions. If we com- 
pare the mean proportions for decision-dis- 
agreement puzzles (.372) and indecision-agree- 
ment puzzles (.567), the difference (.195) is 
significant at better than the .02 level, using a
two-tailed test (Table 1). There is some evi- 
dence that this difference was produced in 
part by proportions based on small samples. 
The corresponding pooled percentages are 
39.6 per cent and 51.6 per cent, respectively, 
their difference being reduced to 12.0 per cent 
(chi square = 2.232, 1 df, p = .14). While the 
reliability of this difference is open to question, 
the possibility is suggested that equivalent 
amounts of motivational energy will have a 
greater effect on recall in indecision states than 
in decision states. In other words, tension 
systems may affect recall more strongly when 
an individual is engaged in wish fulfillment, his 
cognitive structure being highly fluid, than 
when he is oriented toward action, and his 
cognitive structure is relatively nonfluid. 
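
The probability statements attached to the chi-square values in this section can be checked with a short sketch. It assumes only the standard identity that, for 1 df, the upper-tail probability of chi square equals the two-tailed normal probability of its square root; the function name is illustrative.

```python
import math


def chi2_p_1df(chi_square):
    """Upper-tail probability of a chi-square statistic with 1 df.

    For 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2.0))


# Chi-square values as reported in the text (all with 1 df).
reported = {
    "indecision- vs. decision-agreement (pooled)": 6.337,
    "decision- vs. indecision-disagreement (pooled)": 0.0016,
    "decision-disagreement vs. indecision-agreement (pooled)": 2.232,
    "table note (one-tailed in the text)": 3.58,
}

for label, value in reported.items():
    p_two_tailed = chi2_p_1df(value)
    # Halving gives the one-tailed value when the direction was predicted.
    print(f"{label}: two-tailed p = {p_two_tailed:.4f}, one-tailed p = {p_two_tailed / 2:.4f}")
```

On this reading the reported bounds are mutually consistent: 6.337 falls between the .02 and .01 levels, 2.232 gives a p of about .14, and 3.58 passes the .05 level only under a one-tailed test.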

Our hypotheses are stated in terms of 
psychological mechanisms which should be
exhibited by any person. One would therefore
be obliged to explain the behavior of persons
not conforming to the hypotheses in terms of 
their failure to adopt the set which the experi- 
mental manipulations were designed to pro- 
duce. Questionnaires distributed to Ss after 
each session attempted to ascertain how Ss 
viewed the experimental situation. To measure 
the degree to which Ss accepted the goal which 
we tried to induce, they were asked: “Have 
you ever lost interest or become impatient 
with the test?”; answers were indicated on a 
seven-point rating scale ranging from “‘never” 
to “frequently.” Scores were assigned to Ss’ 
responses, ranging from 7 for “never lost 
interest” to 1 for “frequently lost interest.” 
Since all Ss found themselves in disagreement 
with the group vote approximately half the 
time, any tension systems which they may 
have developed, related to the goal of attaining 
maximum agreement in voting, should remain 
to some degree unreduced at the time of recall. 
However, the initial magnitude of Ss’ tension
systems should be expected to vary with their
level of interest in the assigned task, which
implies systematic differences in the degree to
which different Ss will manifest the effects we
have hypothesized.

Analysis of responses to the “interest” item
shows a significant relationship between Ss’
expressed interest and whether their recall
conforms to the hypothesis (Table 4). In the decision
condition, Ss who recalled relatively more
disagreements than agreements, i.e., who
showed the predicted Zeigarnik effect, had
significantly higher interest scores (M = 4.95)
than those who failed to show the effect
(M = 3.54). The difference between these
means is significant at better than the .02
level. In the indecision condition, Ss who
recalled relatively more agreements than
disagreements, i.e., who showed the predicted
reversal of the Zeigarnik effect, were again
significantly higher in interest (M = 4.58)
than those who did not perform as predicted
(M = 2.20). This difference is significant at
better than the .01 level. It is evident from
these data that at least some of the variance
in our data can be attributed to the fact that
the attempt to induce Ss to accept the
experimental goal did not meet with uniform success.
One can conclude that the assumption that
we are dealing with a psychological mechanism
common to individuals in general is not an
untenable one.


TABLE 4
COMPARISON OF MEAN INTEREST SCORES FOR Ss
WHOSE RECALL OF DECISION OR INDECISION
PUZZLES WAS CONSISTENT OR INCONSISTENT
WITH THE HYPOTHESIS

TYPE OF RECALL              MEAN SCORE    DIFF.    t

Decision puzzles
  Consistent (N = 20)          4.95        1.41    2.58*
  Inconsistent (N = 11)        3.54

Indecision puzzles
  Consistent (N = 12)          4.58        2.38    3.56**
  Inconsistent (N = 5)         2.20

* df = 29, p < .02 (two-tailed test).
** df = 15, p < .01 (two-tailed test).


DISCUSSION


The problem dealt with in this paper is an
instance of the more general one of how events
in the social environment impinge on
psychological processes in the individual. The attempt
to ask questions involving the interaction of
individual and group variables appears to lead
to formulations with interesting consequences
for individual and group psychology considered
separately. In the following discussion we wish
to indicate the implications of the study, first for
the psychodynamics of decision making, and
second for the conceptual treatment of
motivational persistence in social groups.
Psychodynamics of decision making. We alluded
earlier to Cartwright’s (2) theory of the
process involved in coming to a decision. This
theory, which has been elaborated and quantified
by Cartwright and Festinger (3), assumes
that the person who is making a choice between 
two courses of action views himself first in one, 
then in another, of two overlapping situations. 
When the forces operating on the person in 
one situation exceed the forces in the other 
situation by a given amount, the person 
will act. 

In a subsequent paper, dealing with the 
so-called group decision experiments by Bave- 
las and others, Lewin (12) suggests an exten- 
sion of this theory according to which decision 
making additionally involves a reduction of
one of the opposing sets of forces. In one of the 
Bavelas experiments Ss, in two different treat- 
ments, were presented with the alternative of 
serving or not serving certain new foods. It 
was found that in the treatment in which Ss 
engaged in an act of decision about using the 
new foods, a greater proportion served these 
foods, and persisted in doing so, than did those 
in the treatment which omitted the act of 
decision. On the strength of these results, 
Lewin argues that decision making must mean 
not merely that the forces toward one alterna- 
tive have become stronger than the forces 
toward the other alternative, but that “the 
potency of one alternative has become zero 
or is so decidedly diminished that the other 
alternative and the corresponding forces 
dominate the situation....If the opposing 
forces in a conflict merely change so that the 
forces in one direction become slightly greater 
than in the other direction, a state of blockage 
or extremely inhibited action results rather 
than that clear one-sided action which follows 
a real decision” (12, p. 336). 

In Lewin’s system of motivation, needs or 
tension systems are the sources of energy 
which underlie psychological forces. If de- 
cision making involves reducing the potency 
of one set of psychological forces, a further 
problem can be raised within this system 
about what happens to the tension systems
underlying these forces when decisions are
made. The present study, which makes use of
the fact that in a social group behavior beyond
a choice point may occur with or without a 
prior act of decision, sheds light on this prob- 
lem. On the assumption that indecision corre- 
sponds to an individual’s being located in a 
fluid behavioral field, we derived the conse- 
quence that when he is unable to make a 
decision, motivational energies should be
channeled into wish fulfillment. The act of 
decision making, we assumed, stabilizes the 
behavioral field and makes cognitive restruc- 
turing difficult, so that if the organism is to 
reduce its need tensions, it must channel 
energy into action. From this standpoint the 
act of decision making can be viewed as in- 
volving a mechanism for controlling the dis- 
position of motivational energy into tendencies 
toward wish fulfillment versus tendencies to- 
ward action. The experiment reported here 
attempted to isolate these two modes of chan- 
neling energy, and the results appear to sub- 
stantiate Lewin’s suggestion that, contrary to 
the prevalent view, action cannot be regarded 
as a direct result of motivation, but requires 
mediation by decision or some equivalent 
process. This would account for the effects of
the presence or absence of decision making 
upon persistence or nonpersistence of action 
in the Bavelas experiments, and upon recall of 
unfinished or finished tasks in the present 
study. 

Motivational persistence in social groups. 
Theorists of social organization such as Simon 
(18) and Barnard (1) have indicated the need 
for a more adequate psychological treatment 
of decision making for dealing conceptually 
with the structure of organizational activities. 
After discussing the ubiquitousness of de- 
cision making in social life, and vainly consult- 
ing the psychological literature for a treatment 
of the processes involved, Barnard observes 
that ‘‘decisive behavior, as contrasted with 
responsive behavior, seems to have received 
little attention in the psychologies” (1, p. 30). 

Recent work on group dynamics raises a 
similar theoretical demand. Groups operating 
on the basis of consensus have been studied in 
which the member is allowed to decide about 
alternative paths before he engages in activi- 
ties toward a goal. These groups have been 
contrasted with “autocratic” ones in which 
the decision-making step is more frequently
short-circuited, with decisions being made for
the member about the path he is obliged to
follow in acting toward a goal. Lippitt (14)
found that in “democratic” or decision making 
groups, individuals showed greater persistence 
in their work than did members of “auto- 
cratic” groups. The work of Coch and French 
(4) and Maier (15) gives further evidence that
persistence in goal striving—one aspect of 
what we generally designate as morale—is 
associated with the opportunity of group 
members to make decisions relevant to their 
goals. 

The present study, which focuses on the 
effects of decision making on internal processes 
of the individual, appears to clarify to some 
extent why decision making in a group should 
produce the above results. The hypothesis 
investigated here implies that a significant 
dimension of a group is the degree to which it 
constitutes a psychological environment in 
which at one extreme members are carried
toward or away from goals or avoidances. 
With respect to the activity of decision making, 
this characteristic of a group will have the 
following consequence: If the group is one in 
which individuals are carried through points 
of decision, members will tend to find themselves
in fluid psychological situations, and
tend to channel motivational energies into 
wish fulfillment. If the group is one in which 
members can engage in the activity of decision 
making, they will find themselves in less fluid 
psychological situations and show greater 
“action orientation.” 

The indecisiveness of members in the present 
experiment resulted from the fact that the 
group required them to act even though they 
had insufficient information on which to base 
a decision. In this respect the present situation 
differs from that of “autocratic” groups in 
which members are not allowed to make 
decisions. Other conditions which affect the 
degree of decisiveness of individuals have been 
described in sociological terms by Durkheim 
(6) in his discussion of anomie. An experi- 
mental attempt to reproduce anomie and 
related characteristics of groups, and to in- 
vestigate their effects from the standpoint of 
decision making and persistence in action, has 
been reported in a recent publication (10). 


SUMMARY 


In attempting to interpret results of an 
earlier experiment, the hypothesis was suggested
that as a result of decision making,
motivational energy tends to be channeled
into action, and that in the absence of decision
making, motivational energy tends to be
channeled into wish fulfillment. To isolate 
these hypothesized effects, behavior of Ss was
compared in a group situation under conditions
where Ss were able or unable to make
decisions. The channeling of motivation into
action was conceptually coordinated to the 
tendency to recall unfinished tasks; the chan- 
neling of motivation into wish fulfillment, to 
the tendency to recall finished tasks. 

The goal set for Ss in the present experiment 
was that of voting so as to agree with announced
majority decisions. Tendency to recall
finished tasks was operationally measured by 
recall of agreements, and tendency to recall 
unfinished tasks by recall of disagreements. 
Three modes of statistical analysis were em- 
ployed in an effort to deal with problems which 
arose because our scores were based on propor- 
tions. These tests showed that the hypothesis 
concerning the tendency to recall disagreements 
in decision states and the contrasting tendency
to recall agreements in indecision states could
be accepted at close to, or better than, the .05
level. A significant relationship was found, too,
between degree of acceptance by Ss of the
group goal and whether or
not their behavior conformed with the hypothesis,
indicating that negative cases were due
in part to failure to induce all Ss to accept the
experimental situation. Implications of these
findings for the psychodynamics of decision 
making, and for the explanation of group 
effects on the motivational persistence of 
members, were discussed. 


THE INFLUENCE OF ROLE PLAYING ON OPINION CHANGE¹





IRVING L. JANIS AND BERT T. KING
Yale University 


IN MANY everyday situations, people are
induced to play social roles in which they
express ideas that are not necessarily in
accord with their private convictions. That
certain types of role-playing experiences can
facilitate changes in personal opinions has
been suggested by various impressionistic
observations (e.g., Myers [8]). In recent
years, psychodramatic techniques which involve
role playing have been developed for
use in adult education programs, leadership
training, employee counseling, and psychotherapy
(1, 5, 6, 7, 9). The usual procedure
consists of having persons in a group play
specified roles in a simulated life situation.
One of the main values of this role-playing
device, according to its proponents, is that it
has a corrective influence on various beliefs
and attitudes which underlie chronic difficulties
in human relations (cf. Maier [6]).
As yet little is known about the conditions
under which role playing leads to actual
changes in personal opinions. The present experiment
was designed to investigate the
effects of one type of demand that is frequently
made upon a person when he is induced
to play a social role, namely, the requirement
that he overtly verbalize to others
various opinions which may not correspond
to his inner convictions.

As a preliminary step in exploring the effects
of role playing, one of the present authors
interviewed a group of collegiate debaters who,
as members of an organized team, repeatedly
were required to play a role in which they
publicly expressed views that did not correspond
to their personal opinions. Most of
the debaters reported that they frequently
ended up by accepting the conclusions which


1 This study was conducted at Yale University as 
part of a program of research on factors influencing 
changes in attitude and opinion. The research program 
is supported by a grant from the Rockefeller Founda- 
tion and is under the general direction of Professor 
Carl I. Hovland, to whom the authors wish to express 
appreciation for helpful suggestions and criticisms. 
The authors also wish to thank Professor Fred D. 
Sheffield for valuable suggestions during discussions 
preparatory to designing the experiment.
they had been arbitrarily assigned to defend.
Myers’ (8) impressionistic account of the 
improvement in morale attitudes produced by 
participation in an Army public-speaking 
course points to the same phenomenon and 
suggests that attitude changes may occur 
even when role playing is artificially induced. 
If true, it would appear that “saying is be- 
lieving”—that overtly expressing an opinion 
in conformity to social demands will influence 
the individual’s private opinion. Conse- 
quently, it seemed worth while to attempt to 
investigate the effects of this type of role 
playing in a more controlled laboratory situa- 
tion where, if the alleged gain from role play- 
ing occurs, it might be possible to isolate the 
critical factors and to explore systematically 
the mediating mechanisms. 

The role-playing effects described above 
have not as yet been verified by systematic 
research. If verified, they would still remain 
open to a variety of alternative explanations. 
For instance, inducing the individual to play
a role in which he must advocate publicly a
given position might guarantee exposure to
one set of arguments to the exclusion of others. 
An alternative possibility, however, is that 
even when exposed to the same persuasive 
communications, people who are required to 
verbalize the content to others will tend to be 
more influenced than those who are only 
passively exposed. In order to test this hypoth- 
esis, the present experiment was designed 
so that communication exposure would be 
held constant by comparing the opinion 
changes of active participants and passive 
controls who were exposed to the same com- 
munications. 


METHOD AND PROCEDURES 


An initial questionnaire, which was administered 
as an opinion survey in a large classroom of male 
college students, contained a series of questions con- 
cerning expectations about the future. Included in this 
“before” questionnaire were the following key opinion
items, which dealt with the subject matter of the three 
communications to which the experimental groups 
were subsequently exposed: 

Item A: During the past year a number of movie
theaters were forced to go out of business as a result of
television competition and other recent developments.
At the present time there are about 18,000 movie
theaters remaining. How many commercial movie
theaters do you think will be in business three years 
from now? 

Item B: What is your personal estimate about the 
total supply of meat that will be available for the civilian 
population of the United States during the year 1953? 
(...— per cent of what it is at present.) 

Item C: How many years do you think it will be
before a completely effective cure for the common cold 
is discovered? 

The experimental sessions were held approximately 
four weeks after the initial questionnaire had been 
filled out, and were represented as being part of a re- 
search project designed to develop a new aptitude 
test for assessing oral speaking ability. The subjects 
(Ss) were asked to give an informal talk based on an 
outline prepared by the experimenters (Es) which stated
the conclusion and summarized the main arguments to 
be presented. The arguments were logically relevant 
but highly biased in that they played up and inter- 
preted “evidence” supporting only one side of the issue. 
Each active participant was instructed to play the role 
of a sincere advocate of the given point of view, while 
two others, who were present at the same experimental 
session, listened to his talk and read the prepared out- 
line. Each S delivered one of the communications and 
was passively exposed to the other two. In order to 
prevent selective attention effects, the active partici- 
pant was not told what the topic of his talk would be 
until his turn came to present it. He was given about 
three minutes to look over the prepared outline, during 
which time the others (passive controls) also were re- 
quested to study duplicate copies of the same outline 
so as to be prepared for judging the adequacy of the 
speaker’s performance. After the first talk was over, 
another S was selected to present the second com- 
munication, and then the remaining S presented the 
third communication, the same procedures being fol- 
lowed in each case. 

Immediately after the last talk was finished, Ss 
were given the “after” questionnaire, much of which 
was devoted to rating the performance of each speaker. 
The key opinion items were included among numerous 
filler items, all of which were introduced as questions 
designed to provide information about the student’s 
interests and opinions concerning the three topics so 
as to enable the investigators to select the most ap- 
propriate topic for future applications of the oral 
speaking test. 

In all three communications, the conclusion specified 
an opinion estimate which was numerically lower than 
that given by any of the students on the “before” test. 
Thus, all active participants were required to argue in 
favor of an extreme position which differed from their 
initial beliefs. The influence of each communication 
could readily be observed by noting the degree to 
which the students in each group lowered their opinion 
estimates on the “after” test. 

The basic schema of the experiment is shown in
Table 1. In each row of the table, which represents
exposure to a given communication, there is one group
of active participants and two contrasting groups which,
when combined, form the group of passive controls.
In effect, the experimental treatments were repeated
with different communication contents, providing
three separate instances of active versus passive
exposure, although the same Ss were used throughout.

TABLE 1
SCHEMA OF THE EXPERIMENTAL CONDITIONS

                                      GROUP A         GROUP B         GROUP C
                                      (N = 31)        (N = 29)        (N = 30)

Communication A: movie theaters       active          passive         passive
                                      participants    controls        controls

Communication B: meat supply          passive         active          passive
                                      controls        participants    controls

Communication C: cold cure            passive         passive         active
                                      controls        controls        participants

In order to obtain some information for checking 
on selective attention effects, a variation of the passive 
control condition (not represented in the table) was 
introduced into the experiment by using a small sup- 
plementary group who listened and took notes on all 
three talks. In addition, base-line data for assessing the 
effectiveness of the communications were obtained 
from a comparable group of “pure” controls who were 
not exposed to any of the communications. 


RESULTS AND DISCUSSION 


Effects of active participation. Initially, on 
each of the three key items in the precom- 
munication questionnaire, the difference be- 
tween the active participation group and the 
passive control group was nonsignificant. The 
opinion changes observed after exposure to 
the three communications are shown in Table
2.² The results indicate that in the case of two
of the three communications (A and B), the 
active participants were more influenced than 
the passive controls. For both communica- 


2 The table does not include the data on the “pure” 
(unexposed) control group. The net changes for this 
group were approximately zero in the case of all three 
key items, and the corresponding net changes for the 
active participants and the passive controls (shown in 
the last rows of the table) were significantly greater 
(p’s range from .10 down to <.01). Hence, all three 
communications had a significant effect on the opinions 
of those who were either actively or passively exposed 
to them. 

The probability values reported throughout this 
paper are based on one tail of the theoretical distribu- 
tion. Whenever intergroup comparisons were made 
with respect to the net percentage who changed by a 
given amount, the reliability of the difference was 
tested by the formula presented in Hovland, 
Lumsdaine, and Sheffield (3, p. 321).

TABLE 2
COMPARISON OF ACTIVE PARTICIPANTS WITH PASSIVE CONTROLS ON AMOUNT OF CHANGE IN OPINION ESTIMATES

COMMUNICATION A (MOVIE THEATERS): ACTIVE PARTICIPANTS (N = 31), PASSIVE CONTROLS (N = 57)*
COMMUNICATION B (MEAT SHORTAGE): ACTIVE PARTICIPANTS (N = 29), PASSIVE CONTROLS (N = 57)
COMMUNICATION C (COLD CURE): ACTIVE PARTICIPANTS (N = 30), PASSIVE CONTROLS (N = 53)

CHANGES IN OPINION ESTIMATES†
Sizable increase
Slight increase
No change
Slight decrease
Sizable decrease
Total
Net change (% increase minus % decrease)
  Slight or sizable change
  Sizable change
  p


23% 
100% 


−71%
—45% 


—58% 
—21% 


7% 
10% 
13% 
23% 
47% 

100% 


0% 
7% 
24% 
27½%
41½%
100% 


2% 
14% 
16% 
49% 
19% 

100% 


—53% 


—40% 
> .30 


—62% —52% 


—45% 


−41½% −17%
.01 





* The number of cases in each passive control group is slightly smaller than expected from the N’s shown in Table 1 because the
data from a few cases were inadequate and hence were eliminated from the analysis (e.g., the individual failed to give an answer to
the particular question).


† The “net change (slight or sizable)” is defined as the percentage changing in the direction advocated by the communication
minus the percentage changing in the opposite direction. The “net sizable change” in the case of Communication A refers to the difference
in the percentages who lowered and raised their estimate by 5,000 (movie theaters) or more. For Communication B, a sizable
change was 25 (per cent) or more; for Communication C it was 5 (years) or more.


tions, the differences in net sizable change are 
statistically reliable, and the differences in net 
(slight or sizable) change, although non- 
reliable, are in the expected direction. 
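
The net-change indices in the last rows of Table 2 follow arithmetically from the percentage distributions above them. A minimal sketch of that arithmetic is given here, using the three percentage columns of Table 2 that survive legibly in this copy; reading the two fractional entries as 27½ and 41½ is an assumption, and the function name is illustrative.

```python
def net_changes(sizable_inc, slight_inc, no_change, slight_dec, sizable_dec):
    """Return (net slight-or-sizable change, net sizable change).

    Both indices are per cent increase minus per cent decrease, so a
    communication advocating a lower estimate yields negative values.
    """
    total = sizable_inc + slight_inc + no_change + slight_dec + sizable_dec
    if abs(total - 100) > 0.01:
        raise ValueError("percentages should sum to 100")
    net_any = (sizable_inc + slight_inc) - (slight_dec + sizable_dec)
    net_sizable = sizable_inc - sizable_dec
    return net_any, net_sizable


# Percentage columns as read from Table 2, in the row order:
# sizable increase, slight increase, no change, slight decrease, sizable decrease.
columns = [
    (0, 7, 24, 27.5, 41.5),  # fractional entries assumed to be 27 1/2 and 41 1/2
    (2, 14, 16, 49, 19),
    (7, 10, 13, 23, 47),
]

results = [net_changes(*column) for column in columns]
for column, (net_any, net_sizable) in zip(columns, results):
    print(column, "->", net_any, net_sizable)
```

The three results correspond to the net entries −62 and −41½, −52 and −17, and −53 and −40 that appear in the table, which is the basis for pairing each distribution with its net-change row.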

In the case of the third communication 
(C), the two groups showed approximately 
the same amount of opinion change. But addi- 
tional findings (based on confidence ratings 
given by each S immediately after answering 
the key opinion questions) indicate that the 
active participants who presented Communi- 
cation C, like those who presented the other 
two communications, expressed a _ higher 
level of confidence in their postcommunication 
estimates than did the corresponding passive 
controls. Table 3 shows the net changes in 
confidence ratings for each of the three com- 
munications in terms of a breakdown that 
takes account of the direction and magnitude 
of opinion change. The breakdown was neces- 
sary inasmuch as a successful communication 
would be expected to increase the confidence 
only of those who changed their opinions in 
the direction advocated by the communica- 
tion. The net change in confidence shown for 
each subgroup is based on a comparison of pre- 
and postcommunication ratings given by each 
S, and was computed by subtracting the per- 


centage who showed a decrease in confidence 
from the percentage who showed an increase 
in confidence. In general, the findings in Table 
3 reveal a consistent pattern for all three 
communications: in every instance, active par- 
ticipation tended to have at least a slight posi- 
tive effect with respect to increasing the con- 
fidence of those whose opinion estimates were 
influenced by the communication. The results 
indicate that active participation resulted in 
a significant gain in confidence, particularly 
among those students whose opinion estimates 
were markedly influenced by Communication 
C.³ This finding is especially striking in view


3 For the entire group of active participants who
were exposed to Communication C, there was a net
increase in confidence of 37 per cent; the corresponding
net increase for the entire group of passive controls
was only 13½ per cent. This difference was due entirely
to the marked gain in confidence manifested by
those students in the active group who had changed 
their opinion estimates in the direction advocated by 
the communication. The results in the first row of the 
table indicate that, among the students whose opinion 
estimates were uninfluenced by Communication C, the 
active participants showed a small net decrease in 
confidence which was equal to that shown by the 
passive controls. The next row of Table 3 indicates 
that, among those students who decreased their opinion 
estimates by at least one-half year or more after ex-

TABLE 3
COMPARISON OF ACTIVE PARTICIPANTS WITH PASSIVE CONTROLS ON AMOUNT OF CHANGE IN CONFIDENCE

NET CHANGE IN CONFIDENCE (PER CENT INCREASE MINUS PER CENT DECREASE)

SUBGROUP BREAKDOWN ACCORDING        COMMUNICATION A          COMMUNICATION B          COMMUNICATION C
TO CHANGES IN OPINION ESTIMATES     ACTIVE      PASSIVE      ACTIVE      PASSIVE      ACTIVE      PASSIVE
                                    PARTICI-    CONTROLS     PARTICI-    CONTROLS     PARTICI-    CONTROLS
                                    PANTS                    PANTS                    PANTS

1. Uninfluenced: opinion            −12%        −5%          0%          +6%          −11%        −11%
   estimates increased or           (N = 8)     (N = 18)     (N = 9)     (N = 18)     (N = 9)     (N = 18)
   unchanged

2. Influenced: opinion              +9%         −10%         +20%        +5%          +57%        +26%
   estimates slightly or            (N = 23)    (N = 39)     (N = 20)    (N = 39)     (N = 21)    (N = 35)
   sizably decreased

   Gain from active                       +19%                     +15%                     +31%
   participation

3. Highly influenced: opinion       −7%         −38%         +25%        0%           +64½%       +15%
   estimates sizably decreased      (N = 14)    (N = 13)     (N = 12)    (N = 11)     (N = 14)    (N = 27)

   Gain from active                       +31%                     +25%                     +49½%
   participation





of the fact that the opinion change results for 
Communication C (Table 2) failed to show any 
gain from active participation. 
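
The “gain from active participation” entries of Table 3 are simple differences between the net confidence changes of the two groups, and the over-all figures quoted in footnote 3 can be approximated by weighting the subgroup values by their N’s. The sketch below, with illustrative names, uses only values read from Table 3 and footnote 3; because the tabled percentages are rounded, the pooled figures are approximate.

```python
# Net change in confidence (per cent increase minus per cent decrease) for the
# "influenced" row of Table 3: (active participants, passive controls).
influenced = {
    "A": (9, -10),
    "B": (20, 5),
    "C": (57, 26),
}

# Gain from active participation = active net change minus passive net change.
gains = {comm: active - passive for comm, (active, passive) in influenced.items()}


def pooled_net(subgroups):
    """Weight each subgroup's net change by its N to approximate the group value."""
    total_n = sum(n for n, _ in subgroups)
    return sum(n * net for n, net in subgroups) / total_n


# Communication C subgroups as (N, net change): uninfluenced, then influenced.
c_active = pooled_net([(9, -11), (21, 57)])
c_passive = pooled_net([(18, -11), (35, 26)])

print(gains)
print(round(c_active, 1), round(c_passive, 1))
```

The differences reproduce the tabled gains of 19, 15, and 31 per cent, and the pooled values for Communication C come out near the 37 per cent and 13½ per cent given in the footnote.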

Insofar as confidence ratings can be re- 
garded as indicators of the degree of conviction 
with which the new opinions are held, the 
positive findings based on the opinion change 
data for Communications A and B are par- 
tially confirmed by the confidence change 
data based on Communication C. Thus, the 
data based on all three communications con- 
tribute evidence that the effectiveness of the 
communications (as manifested by opinion 
changes or by confidence changes) tended to 
be augmented by active participation. 

Although Ss were not told what their topic 
would be until they were about to begin giving 
the talk, it is possible that the ego-involving 
task of presenting one of the talks may have 





posure to Communication C, the active participants 
showed a greater net increase in confidence than the 
passive controls; the difference of 31 per cent approaches 
statistical significance (p = .07). Finally, the last row 
of the table shows that an even greater difference in 
confidence changes emerges when the comparison is 
limited to those students who decreased their opinion 
estimates by five years or more. (The 49½ per cent
difference is reliable at beyond the .05 confidence 
level.) Further analysis of the subgroup data indicated 
that the differences shown in this table could not be 
attributed to statistical artifacts arising from initial 
differences between the various subgroups. 


given rise to emotional excitement or other 
interfering reactions which could have had the 
effect of reducing the Ss’ responsiveness when 
passively exposed to the other two communi- 
cations. This possibility appears extremely 
improbable, however, in the light of supple- 
mentary control observations: 

1. Some of the passive controls had been 
exposed to the communications before giving 
their own talk, while others were passively 
exposed after having given their own talk. 
Nonsignificant differences were found in the 
amount of opinion change shown under these 
two conditions. 

2. The results from the passive controls 
were “replicated” by the results from an 
independent group of 16 students who did 
not give an oral presentation, but who were 
asked to follow the prepared outline carefully 
and to note down the main arguments given 
by each of the three speakers. Despite the fact 
that their notes were fairly complete and indi- 
cated a relatively high degree of attention to 
the content of all three communications, 
these supplementary controls displayed ap- 
proximately the same amount of opinion 
change as the original group of passive controls.⁴


4 It is conceivable, of course, that the activity of 
taking notes on the talks might have interfered with 







Observations pertinent to explanatory hypoth- 
eses. Many different types of speculative 
hypotheses could be put forth to account for 
the facilitating effects of active participation, 
postulating a gain in attention and learning 
from overtly rehearsing the communication, 
or a gain in comprehension from reformulating 
the arguments in one’s own words, or a gain 
in motivation from playing the role of com- 
municator, etc. Some supplementary ob- 
servations were made for the purpose of ex- 
ploring various factors which might provide 
leads to the key mediating mechanisms. 
Although far from conclusive, the evidence 
derived from these observations provides a 
preliminary basis for selecting explanatory 
hypotheses which warrant further experimen- 
tal analysis. 

The findings based on the supplementary 
controls (who were required to take notes on 
the three talks) suggest that variation in 
attention level probably was not a crucial 
factor that could explain the participation 
effects observed in the present experiment. 
More promising clues were discovered by 
taking account of differences in the types of 
reactions evoked by the three communica- 
tions. We have seen that in the case of Com- 
munications A and B, a clear-cut gain from 
active participation was manifested by changes 
in opinion estimates; but, in the case of Com- 
munication C, opinion estimates were un- 
affected, the gain being manifested only in 
the form of increased confidence. With a 
view to discovering some differentiating factor, 
we examined the available evidence bearing 
on the question of why active participation 
might be more effective under certain stim- 
ulus conditions (represented by Communica- 
tions A and B) than under other conditions 
(represented by Communication C). 

The first step in this inquiry was to examine 
E’s notes on: (a) the active Ss’ behavior while 
giving their talks, and (b) Ss’ statements in 
the informal interviews conducted at the end 





responsiveness to the persuasive content of the com- 
munications. While this possibility cannot be excluded, 
it seems implausible inasmuch as our Ss were college 
students who had had considerable practice in taking 
notes during lectures. Educational research on the 
effects of note taking indicates that this form of ac- 
tivity generally has a beneficial rather than a detri- 
mental effect on the student’s ability to absorb the 
content of an oral communication (2). 
of each experimental session. These observations 
provide two suggestive leads: 

1. The active participants who presented 
Communication C seemed to engage in less 
improvisation than those who presented the 
other two communications. The Communica- 
tion C group appeared to adhere much more 
closely to the prepared outline, making little 
attempt to reformulate the main points, to 
insert illustrative examples, or to invent addi- 
tional arguments. 

2. Active participants in the Communica- 
tion C group seemed to experience much more 
difficulty than the other groups in presenting 
their talks. During their performance they 
appeared to be more hesitant and tense. 
Afterwards, they expressed many more com- 
plaints about the task, claiming that their 
topic was more difficult to present than either 
of the other two. In general, these subjects 
seemed less satisfied with their performance 
than those who presented the other two topics. 

The first observation suggests that mere 
repetition of a persuasive communication 
may have little or no effect as compared with 
an improvised restatement. This observation 
is in line with some suggestive findings from 
an opinion change study by Kelman (4) in 
which seventh-grade students were given a 
communication, and, immediately afterwards, 
were offered various incentives to write essays 
in support of the communicator’s position. 
Kelman observed that the essays written by 
the group which showed the greatest amount 
of opinion change tended to be longer, to 
include more improvisation, and to be of 
better over-all quality (as rated by several 
judges) than the essays written by the other 
experimental groups. 

Reformulating and elaborating on the com- 
munication might be a critical factor in pro- 
ducing the gain from active participation, 
perhaps because the communicatee is stimu- 
lated to think of the kinds of arguments, 
illustrations, and motivating appeals that he 
regards as most convincing. The importance 
of the improvisation factor in relation to par- 
ticipation effects could not be investigated 
further with the data at hand from the present 
experiment, but is currently being studied by 
the present authors in another experiment 
that is specifically designed to compare the 
effects of different types of active participa- 
tion. 

TABLE 4 

SELF-RATINGS OF ACTIVE PARTICIPANTS IN EACH EXPERIMENTAL GROUP 

                                          EXPERIMENTAL GROUPS (ACTIVE PARTICIPANTS) 

SELF-RATING RESPONSE                      COMMUNICATION A    COMMUNICATION B    COMMUNICATION C 
                                          (Movie Theaters)   (Meat Supply)      (Cold Cure) 
                                          (N = 31)           (N = 29)           (N = 30) 

1. Over-all performance was at least      94%                83%                63% 
   “satisfactory” 

2. Rarely or never spoke in a             64%                76%                53% 
   monotonous tone of voice 

3. Rarely or never incoherent in          74%                83%                57% 
   presenting arguments 

4. No distortions or misinterpretations   32%                52%                13% 
   of arguments in the prepared outline 

5. No omissions of any of the main        74%                72%                70% 
   arguments 

6. Succeeded in giving the impression     52%                52%                43% 
   of being “sincere” 

Combined rating on all six items:         39%                52%                13% 
   five or more favorable self-ratings 

With respect to the second observation, it 
should be noted that there may have been 
an objective basis for the greater dissatisfac- 
tion experienced on Communication C be- 
cause of the greater amount of unfamiliar 
technical material it contained. The “cold 
cure” outline referred to a great many tech- 
nical details concerning the cold virus, anti- 
biotics, allergic reactions, and antihistamines. 
Many of these details were probably un- 
familiar to Ss, and consequently it may have 
been difficult for them to “spell out” the im- 
plications of the arguments. In contrast, the 
outlines for the other two topics contained 
very little technical material, relying mainly 
on arguments that were likely to be quite 
familiar to college students. 

Systematic evidence relevant to Ss’ per- 
ception of the difficulty of presenting each 
communication was obtained by making use 
of the self-rating schedule which each student 
filled out after exposure to the three communi- 
cations. Table 4 shows the percentage in each 
experimental group who rated their own per- 
formance as adequate or satisfactory on each 
of six self-appraisal items. 

The most comprehensive question was the 
following: ‘‘What is your over-all rating of the 
informal talk given by this speaker—how 
good a job do you think he did in presenting 


his material? ____ Excellent; ____ Very 
Good; ____ Satisfactory; ____ Poor; ____ 
Very Poor.” 


The percentage who rated themselves as 
“satisfactory” or better (shown in the first 
row of the table) was significantly lower for 
the group who presented Communication C 
than for the groups who presented Communications 
A and B (p = .002 and .04, respectively). 
On the remaining five items, each of 
which dealt with a specific aspect of the 
speaker’s performance, the Communication 
C group also tended to rate themselves lower 
than did the other two groups. (On the com- 
bined rating, based on all six items, the per- 
centage differences are statistically significant 
at beyond the .05 confidence level.) The find- 
ings consistently indicate that the students 
in the Communication C group felt less satis- 
fied with their oral speaking performance than 
did those in the other two groups. Since the 
group differences in self-ratings tend to paral- 
lel the group differences in amount of gain 
from active participation, the results suggest 
that satisfaction with one’s own performance 
may be a critical factor that determines the 
magnitude of participation effects. 

Further evidence which supports this 
hypothesis was obtained from an analysis of 
individual opinion changes, comparing active 
participants with high and low self-ratings 
for each of the three communications. For 
example, among the active participants who 
presented Communication C, there were 18 
students whose self-ratings were compara- 
tively “high” (three to six favorable responses) 
and 12 cases whose self-ratings were predominantly 
“low” (zero, one, or two favorable 
responses); 55 per cent of the “highs” as 
against only 17 per cent of the “lows” showed a 
sizable net opinion change in the direction 
advocated by the communications (p = .05). 
In general, the comparisons based on all three 
communications consistently indicate that a 
greater amount of opinion change occurred 
among those active participants who rated 
their oral speaking performance as satisfac- 
tory or better. Active participants who felt 
that they performed poorly, on the other 
hand, failed to show any more opinion change 
than the passive controls, and, in the case of 
Communication C, showed markedly less 
change than the passive controls (p = .07). 
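
The comparison of "highs" and "lows" reported above can be illustrated with a short sketch. The article does not state which test yielded p = .05, so the Python example below assumes a pooled two-proportion z test (the function name `two_proportion_z` is hypothetical) and treats the reported net percentages, 55 and 17 per cent, as if they were simple proportions of 18 and 12 students. Under those assumptions the result need not match the published probability exactly.

```python
from math import erf, sqrt


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z test; returns (z, two-sided p-value)."""
    # Pooled estimate of the common proportion under the null hypothesis.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Figures reported for Communication C, treated here as simple proportions
# (an assumption: the article reports them as *net* opinion change).
z, p_two_sided = two_proportion_z(0.55, 18, 0.17, 12)
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
```

With groups this small the normal approximation is rough; an exact test on the underlying counts would be preferable if those counts were available.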

During the experimental sessions there were 
no apparent sources of external social rewards 
from the environment. Since the others pres- 
ent remained silent, the active participant had 
no opportunity to know how they were react- 
ing to his talk, except possibly by subtle signs 
from their facial expressions or from their 
bodily movements. But even in the absence 
of any external cues to social approval, it 
seems probable that anticipations concerning 
such approval would occur if the individual 
felt that he was performing well, as expressed 
in his self-ratings. Thus, expectations of 
favorable audience reactions may have oc- 
curred less frequently among Ss who were 
required to perform the relatively difficult 
task of presenting the unfamiliar technical 
material in Communication C than among 
those who were required to perform the less 
difficult task of presenting Communication A 
or B. The increase in opinion change produced 
by role playing might be mediated by the 
individual’s sense of achievement or his elated 
feelings about the adequacy of his oral per- 
formance. One hypothesis that would follow 
from this assumption is that when a person 
conforms outwardly to social demands by 
playing a role which requires him to advocate 
a given opinion, he will begin to believe what 
he is saying if he is made to feel that he says 
it well. 

Although the above hypothesis is suggested 
by the supplementary correlational findings, 
it will obviously remain open to question until 
tested by more precise methods. One cannot 
be certain that the responses used to assess 
“satisfaction” represent a separate variable 
which is causally related to opinion changes. 
Acceptance of the communication might be a 
common factor which inclines those who are 
most influenced to perceive themselves as 
having performed well, in which case the self- 
ratings might merely reflect the same thing as 
the measures of opinion change. Moreover, 
even if the two variables can be varied and 
measured independently, the possibility re- 
mains that the observed relationship may be 
due to some third variable, such as amount of 
improvisation. 

As was noted earlier, the group of active 
participants who showed the least amount of 
opinion change not only expressed a low de- 
gree of satisfaction but also displayed a rela- 
tive absence of improvisation in their oral 
performances. Either the “satisfaction” factor 
or the “improvisation” factor might prove to 
be a critical mediating variable. Before draw- 
ing a definite conclusion, it is necessary to in- 
vestigate each factor experimentally—for 
instance, by giving the Ss “expert” perform- 
ance ratings which raise or lower their feelings 
of satisfaction, and by using instructions which 
increase or decrease the amount of improvisa- 
tion. These methods are currently being 
employed in our further research on the effects 
of role playing. 

There is another important problem which 
arises from the findings in the present experi- 
ment and which also requires systematic in- 
vestigation: Does social role playing facilitate 
the internalization of externally imposed 
value judgments, mores, and taboos? The 
persuasive communications used in this study 
dealt with relatively impersonal beliefs about 
the future, and the main findings show that ac- 
ceptance of opinions of this sort was markedly 
increased by experimentally induced role 
playing. It remains problematical, however, 
whether active participation also influences 
the acceptance of opinions and attitudes that 
are more directly tied up with daily life ac- 
tivities, interpersonal relationships, and emo- 
tionally charged dilemmas. 

Obviously, it is unsafe to generalize widely 
from a single exploratory study based on the 
opinion changes of college students pro- 
duced in a somewhat artificial test situation. 
Nevertheless, the present experiment pro- 
vides preliminary evidence indicating that 
verbal conformity elicited by role playing can 
significantly influence the acceptance of new 
beliefs. Under certain specifiable conditions 
which await further investigation, it seems to 
be true that “saying is believing.” 


SUMMARY AND CONCLUSIONS 


The experiment was designed to determine 
whether or not overt verbalization, induced by 
role playing, facilitates opinion change. Male 
college students were assigned at random to 
two main experimental groups: (a) active par- 
ticipants, who, with the aid of a prepared out- 
line, played the role of a sincere advocate of 
the given point of view, and (b) passive controls, 
who silently read and listened to the 
same communication. In the experimental 
sessions, three different communications were 
used, each of which argued in favor of a 
specific conclusion concerning expected future 
events and was presented by a different active 
participant. Opinion measures obtained at the 
end of the session were compared with the 
“before” measures obtained about one month 
earlier. 

In general, the active participants tended 
to be more influenced by the communications 
than were the passive controls. In the case of 
two of the communications the active par- 
ticipants showed significantly more opinion 
change than the passive controls. In the case 
of the third communication, both groups 
showed approximately the same amount of 
opinion change, but active participation, 
nevertheless, tended to increase the level of 
confidence of those whose opinion estimates 
were influenced by the communication. The 
main findings, together with various methodo- 
logical checks, support the hypothesis that 
overt verbalization induced by role playing 
tends to augment the effectiveness of a 
persuasive communication. 

Additional observations were analyzed in 
order to explore possible mediating factors 
underlying the gain in opinion change due to 
active participation. From behavioral records 
and interviews, two suggestive leads emerged. 
In those cases where role playing produced a 
marked increase in opinion change: (a) the 
individual displayed a relatively great amount 
of improvisation in his talk, and (b) he felt 
comparatively well satisfied with his oral 
speaking performance. The first factor sug- 
gests that the gain from role playing may occur 
primarily because the active participant 
tends to be impressed by his own cogent 
arguments, clarifying illustrations, and con- 
vincing appeals which he is stimulated to 
think up in order to do a good job of “selling” 
the idea to others. The second factor suggests 
an alternative explanation in terms of the 
rewarding effects of the individual’s sense of 
achievement or feelings of satisfaction with 
his performance in the role of active partici- 
pant. Additional evidence pertinent to the 
second factor, based on a self-rating question- 
naire which the Ss filled out immediately 
after giving the talk, consistently indicated 
that the greatest amount of opinion change 
occurred among those active participants who 
felt that their oral speaking performance was 
satisfactory. Both the “improvisation” factor 
and the “satisfaction” factor warrant further 
investigation. 


PERSONALITY CHANGES FOLLOWING TRANSORBITAL LOBOTOMY¹ 


HARRY W. ALLISON AND SARAH G. ALLISON 
University of Oklahoma 


ALTHOUGH transorbital lobotomy has 
been used extensively since 1946, 
there has been a paucity of experimental 
research on the effectiveness of this 
operation. And although much interest has 
been manifest in the psychological changes 
following this operation, the findings have 
been inconclusive and obscure. This may be 
in part because of a lack of sensitivity of some 
of our measuring instruments, but another 
reason for these inconclusive results has been 
the lack of rigid experimental controls. 

In many studies psychological tests have 
been administered to a few patients pre- and 
postoperatively. However, the experimenters 
often failed to validate their results by match- 
ing the operative group with a comparable 
group (in terms of age, sex, diagnosis, etc.) 
who were treated alike in every respect except 
that they did not undergo the operation. 

Owing to the relative simplicity and brevity 
of this operation, as well as the need for some 
type of aid for those patients who suffer 
chronic and severe states of mental illness, 
many psychiatric hospitals have utilized this 
operative procedure despite the fact that there 
is little conclusive and objective knowledge 
concerning the results. 

In view of the present general acceptance of 
these surgical measures, more precise investi- 
gations need to be undertaken for three 
purposes: (a) to determine more precisely 
those psychological changes which have occurred, 
(b) to furnish an objective basis for 
evaluating the effectiveness of transorbital 
lobotomy, and (c) to enable selection of those 
patients for whom this operative procedure 
seems warranted. 

A review of the literature indicates that some 
of the changes which are felt to result from 
lobotomy are: a decrease in anxiety and tension 
(6, p. 658), some loss of insight and the 
ability to introspect (7), some loss of creativity 
(6, p. 417), a decrease of enthusiasm and 
zeal (10), but a better personality integration 
which reflects a total improved personality 
adjustment. This study attempts to validate 
these empirical findings and, in addition, 
attempts to ascertain whether or not pathological 
indications of brain damage will be 
evidenced as a result of the transorbital 
lobotomy. 


1 The authors are indebted to Dr. C. G. Holland, 
H. G. Hansen, and many others on the administrative 
staff at Western State Hospital, Staunton, Va., where 
this study was conducted. Grateful acknowledgement 
also is due to Dr. O. A. Trice, Mary Baldwin College, 
for his helpful advice, and to B. Moskowitz, University 
of Oklahoma, for statistical assistance. 


PROBLEM 


This study has as its over-all purpose the 
task of attempting to denote the personality 
changes which are effected by transorbital 
lobotomy. These changes are assessed by the 
Rorschach test, which is commonly thought 
to delineate the structure of personality. 

Klopfer (9, pp. 279-281) states that m 
(inanimate movement) reflects tensions within 
the personality structure. He adds that an 
individual experiences such promptings from 
within as hostile and uncontrollable forces 
working upon him, rather than as sources of 
energy at his disposal. If the lobotomy has the 
effect of reducing tension, a diminution of this 
scoring determinant, m, would be expected 
after the operation. 

It is generally believed that the lobotomy 
has the effect of lessening anxiety, apprehen- 
sions, and fears. Klopfer states, “Every single 
K (shading as diffusion) and every k (shading 
as three dimensional expanse projected on a 
two dimensional plane) response other than 
‘clouds’ to Card VII is an expression of some 
anxiety” (9, pp. 241-242). Therefore, it is 
hypothesized that fewer K and k determinants 
will be elicited postoperatively. 

It is hypothesized that the lobotomized pa- 
tient loses much of his ability to introspect; 
that is, there is a lessening of his critical self- 
appraisal and insight. Thus, a decrease in the 
measure of introspection and self-awareness, 
FK (shading as three dimensional expanse in 
vista or perspective), would be expected in the 
Rorschach protocols after the operation. 
If creative imagination is reduced in lobotomized 
patients, it would be expected that 
this Rorschach determinant, M (human move- 
ment), should decrease. 

Since transorbital lobotomy has the effect 
of destroying cortical tissue, it would be 
expected that this cortical damage would be 
revealed as organic pathology on the Ror- 
schach test. The extent of such pathology is 
evaluated in this study by the use of a check 
list proposed by Dörken and Kral (3). 

If the lobotomized patient has lost his ardent 
enthusiasm and active interest to some 
degree, this should be reflected by prolonged 
reaction times to the ten Rorschach cards. 

Finally, if transorbital lobotomy is to effect 
a better level of adjustment, the Rorschach 
test should manifest this improvement. There- 
fore, it would be expected that two quantita- 
tive measures of general adjustment, the 
Munroe check list (12) and the Harrower- 
Erickson check list (8), would reflect this 
improved adjustment. 


PROCEDURE 


Subjects. All the subjects (Ss) in this experiment 
were patients at Western State Hospital, Staunton, 
Virginia. The eight psychotic patients comprising the 
operative group were selected for transorbital lobotomy 
on the basis of two criteria: (a) They had failed to 
respond to all types of therapy employed at this hospital. 
(b) The relatives of these patients had signed 
operative permits. The control group was derived from 
the parent hospital population of approximately 2,500. 
Eight control Ss were selected who matched the ex- 
perimental group on seven criteria (sex, age, education, 
diagnosis, treatment, duration of hospitalization, and 
duration of illness). 

Table 1 shows a comparison of the two groups for 
five of the seven matching criteria. In regard to diag- 
nosis, each group contained six schizophrenics, one 
involutional-paranoid, and one manic-depressive, 
depressed type. All patients in both groups had re- 
ceived extensive electroshock therapy with the excep- 
tion of one matched pair who had received no therapy 
of this type prior to the experiment. 

One month prior to the operation, all Ss of both 
groups were given the Rorschach test. In an attempt to 
control all variables, from this time until one month 
after the operation the “buddy system” was rigidly 
enforced; that is, each patient of the experimental 
group was paired with a patient of the control group, 
and each pair was housed on the same ward, attended 
occupational therapy together daily, participated in 
the same recreational activities, ate at the same time, 
etc. One month from the date of testing, the experi- 
mental group received the operation. For this opera- 
tion it is customary to use electroconvulsive shock to 
induce anesthesia. In similar fashion, the control 
group was given the same number and intensity of 

TABLE 1 

COMPARISON OF CONTROL AND EXPERIMENTAL GROUPS IN TERMS OF MATCHING CRITERIA 

                                   MEANS 
GROUP*          AGE    EDUCATION         DURATION OF        DURATION OF 
                                         HOSPITALIZATION    ILLNESS 

Experimental    48     9 yrs. 9 mos.     6 yrs.             12 yrs. 3 mos. 
Control         45     10 yrs. 7 mos.    5 yrs. 1 mo.       12 yrs. 

* Each group was composed of one male and seven female patients. 


electroshock treatments at this time but did not receive 
the operation. One month following the date of opera- 
tion the Rorschach test was readministered to all Ss 
in both groups. The 32 Rorschach protocols were 
scored by the senior author, employing the Klopfer 
and Kelley (9) system of scoring. The identity of each 
protocol was masked and all protocols were scored in 
random order. 


RESULTS 


Cronbach (2) criticizes the use of many of 
the commonly employed statistical techniques 
in Rorschach research because they do not 
control for the inequality of the scale intervals 
and the skewness of the distribution of the 
data. To meet these requirements made 
explicit by Cronbach concerning the inequal- 
ity of points along the continua and the non- 
normal character of our data, a nonparametric 
or distribution-free method of analysis was 
utilized. Moses (11) has recently reviewed and 
evaluated this type of statistic. 

The usual manner of obtaining a distribu- 
tion-free test of significance is to place the 
data in rank order independently on the two 
variables concerned and then to obtain 
the probability with which the smaller sum 
of rankings may appear. However, since the 
patients in this experiment were matched, a 
paired-score difference technique was used 
and the positive and negative differences 
ranked independently. The probability of the 
lowest sum of rankings, either positive or 
negative, was then determined. Wilcoxon (15) 
discusses this particular nonparametric method 
under the title, “Paired Replicates,” and 
provides an abbreviated table of significant 
values (15, p. 14). 
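
The paired-replicates procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' computation: the function name `signed_lower_rank_total`, the example scores, and the sign convention (the smaller rank sum is reported with the sign of the predominant direction of change) are all assumptions made here. The sketch follows the usual form of Wilcoxon's method, in which the absolute paired differences are ranked together, tied values receive the average rank, and zero differences are dropped.

```python
def signed_lower_rank_total(differences):
    """Return (lower rank total, direction) for a list of paired differences.

    direction is "+" when increases predominate, "-" when decreases
    predominate, and "+/-" when the two rank sums are equal.
    """
    nonzero = [d for d in differences if d != 0]
    # Indices ordered by the absolute size of the difference.
    ordered = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * len(nonzero)
    start = 0
    while start < len(ordered):
        # Extend the run while the next absolute difference is tied.
        end = start
        while (end + 1 < len(ordered)
               and abs(nonzero[ordered[end + 1]]) == abs(nonzero[ordered[start]])):
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for position in range(start, end + 1):
            ranks[ordered[position]] = average_rank
        start = end + 1
    positive_sum = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    negative_sum = sum(r for r, d in zip(ranks, nonzero) if d < 0)
    if positive_sum < negative_sum:
        return positive_sum, "-"   # few, small increases: overall decrease
    if negative_sum < positive_sum:
        return negative_sum, "+"   # few, small decreases: overall increase
    return positive_sum, "+/-"     # no differentiation between the groups


# Hypothetical change scores (post minus pre) for eight matched pairs.
experimental_change = [-5, -3, 2, -4, -1, -6, 1, -7]
control_change = [1, 1, 1, 1, 2, 2, -1, 0]
differences = [e - c for e, c in zip(experimental_change, control_change)]
print(signed_lower_rank_total(differences))
```

With eight pairs and no zero differences the ranks sum to 8 × 9 / 2 = 36, so the lower rank total can never exceed 18. The resulting total is then compared with a table of significant values such as the one cited in the text.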

From an inspection of Table 2 it can be 
noted that four of the ten Rorschach factors 
significantly differentiate between the control 

TABLE 2 

NONPARAMETRIC COMPARISONS OF CONTROL AND EXPERIMENTAL GROUPS ON TEN RORSCHACH 
FACTORS AND THREE RORSCHACH CHECK LISTS 

RORSCHACH     LOWER RANK     RORSCHACH               LOWER RANK 
FACTORS       TOTALS         FACTORS                 TOTALS 

K% + k%       -7.00          Rejections              -6.50 
FK%           -4.50*         Responses               -11.00 
W%            +4.50*         Reaction time           +3.00* 
M%            +7.50          Check lists 
m%            -1.00**          Harrower-Erickson     +13.00 
PY            +16.00           Munroe                +13.50 
FC%           -9.00            Dörken & Kral         -6.00 

* Significant at the .05 level. 
** Significant at the .01 level. 


and experimental groups.² The positive and 
negative signs preceding the lower rank total 
indicate an increase or decrease respectively 
of the measured factor for the operative or 
experimental group as contrasted with the 
control group. A plus-minus (±) lower rank 
total denotes a lack of differentiation between 
the groups. Thus, it can be seen that 
there is a significant decrease of m% and FK%, 
and a significant increase in W% and Reaction 
Time in the experimental group after the 
operation. All other factors studied were not 
significant, although there are obvious trends 
in regard to organicity (as measured by the 
Dörken and Kral check list), number of rejections, 
and K% + k%. 


DISCUSSION 


From these data it can be inferred that 
transorbital lobotomy produces or results in a 
lessening of inner tension, a lessening of 
introspective self-awareness and insight, and a 
loss of ardent enthusiasm and active interest or 
zeal. The significant increase in W% is diffi- 
cult to interpret except as some change in 
apperception, although it may indicate a 
better organizational ability. 

Current literature contains abundant quali- 
tative information concerning the functions 
of the frontal lobes. According to Freeman 


2 To save printing costs, two pages of Rorschach raw 
score data for both the experimental group and the 
control group have been deposited with the ADI. 
Order Document No. 4058 from the ADI Auxiliary 
Publications Project, Photoduplication Service, Library 
of Congress, Washington 25, D. C., remitting in ad- 
vance $1.25 for 35 mm. microfilm or $1.25 for 6 by 8 
in. photocopies. Make checks payable to Chief, Photo- 
duplication Service, Library of Congress. 
and Watts (5), this operation upon the frontal 
lobes succeeds because it divorces psychotic 
ideas from accompanying emotional components. 
They state that the psychotic ideas 
usually persist for a while following the opera- 
tion but gradually fade away upon being re- 
leased from their affective charge. They feel 
that this is due to the sectioning of the an- 
terior thalamocortical projections through 
which the affective and emotional charges 
surge to the prefrontal areas to become in- 
tegrated with the intellectual processes of 
foresight, imagination, and consciousness of 
self. 

Freeman and Watts have emphasized the 
point that “. . . emotional tension is the prime 
requisite for success in Prefrontal Lobotomy” 
(6, p. 658). It appears that as yet there is no 
psychiatric diagnosis that causes one to think 
immediately of psychosurgery. Rather, there 
is a constellation of symptoms described by 
Arnot as, “a fixed state of tortured self-concern” 
(1, p. 267). Inner emotional tension 
is implied in this description, and Robinson 
states the following in this regard: “Psycho- 
surgery, then, not only relieves emotional 
tension; it prevents the development of future 
tensions by reducing the individual’s aware- 
ness of his own self-continuity” (14, p. 422). 
It was observed by the authors that following 
the operation, some patients, particularly 
the paranoid schizophrenics, continued to 
manifest delusional ideation. However, these 
formerly painful thoughts and beliefs seemed 
to become less severe in that these patients 
evidenced less tension and emotional concern 
about this ideation. For example, prior to the 
operation one paranoid patient was convinced 
that he was being followed by a “big, black 
nigger” who was intent on killing him. Follow- 
ing the operation, he continued to feel that 
he was being followed; the delusion remained, 
but he no longer appeared to have the tre- 
mendous emotional concern about it. The 
most significant finding in the present study, 
a pronounced decrease of m%, tends to con- 
firm these empirical observations concerning 
a lessening of tension as a result of this opera- 
tive procedure upon the frontal lobes. 

Freeman and Watts have also advanced the 
hypothesis that the frontal lobes are espe- 
cially concerned with foresight and insight 
and that the emotional component associated 
with these functions is supplied by the thalamus. 
They feel that when the thalamic 
connections are severed, the functions of 
foresight and insight suffer temporary oblit- 
eration “...and even in the later course of 
recovery are never as completely endowed with 
feeling tone as they were before” (7, pp. 3-4). 
The finding in the present study of a signifi- 
cant decrease of FK% in the experimental 
group lends support to this hypothesis in 
regard to a lessening of introspective self- 
awareness and insight as a result of trans- 
orbital lobotomy. 

The significantly prolonged reaction times 
seen in this study as a result of the operation 
may well be related to what has been postu- 
lated by Landis. He believes that the loboto- 
mized patient lacks vigilance, “... acting as 
though he were sleepy, often making such 
remarks as, ‘I’m too tired to do this’” (10, 
p. 413). Landis further feels that the loboto- 
mized patient has lost his zealousness, his 
ardent enthusiasm, and active interest. 

There is obscure and conflicting evidence 
concerning signs of intracranial pathology 
due to lobotomy as shown by psychological 
tests. In this regard, Freeman states that 
“the failure of mental tests to reveal either 
positive or negative symptoms after removal 
of a considerable portion of both frontal 
lobes means either that the tests heretofore 
developed are not sensitive to the defects, or 
that the powers of compensation on the part 
of the remaining portions of the frontal lobes 
are so great that no measurable defect re- 
mains” (4, p. 102). Although the authors of 
this study utilized a brain damage check 
list which was thought to be more sensitive 
than that proposed by Piotrowski (13), no 
significant increase of organic signs was demon- 
strated in the experimental group as a result 
of the operation. In the same light, the failure 
of both the Munroe and Harrower-Erickson 
check lists to demonstrate any quantifiable 
change in total adjustment in the operative 
group may also be due to the fact that these 
check lists are too insensitive for this par- 
ticular discrimination. 

It is to be noted that, although nonsignifi- 
cant, three of the factors which were studied 
did approach the defined level of significance. 
In the experimental group as contrasted with 
the control group there was an obvious decrease
in the number of rejections and in K%
+ k%, along with an increase in signs of 
intracranial pathology. With a larger control 
and experimental population it is possible 
that significant evidence of brain damage and 
a significant decrease of emotional blocking 
and anxiety will be more clearly noted. 

In 1946, Freeman stated that “statistics 
are a poor medium by means of which to 
convey the changes that occur in patients 
following Prefrontal Lobotomy” (6, p. 657). 
He felt that the empirical evidence of change 
in some of these patients and their resulting 
transformation into placid, quiet, uncom- 
plaining individuals who showed little con- 
cern about their troubles was justification 
enough for the more drastic operative proce- 
dures. However, the authors of this study are 
of the opinion that only by precise investiga- 
tions can the psychological changes be deter- 
mined. And it can only be from an accurate 
knowledge of these changes that an objective 
basis for the evaluation of the effectiveness 
of the transorbital lobotomy operation can be 
established. Furthermore, we will be better 
able to select those patients for whom this 
operative procedure seems warranted only if 
we have precise and objective knowledge of 
those personality characteristics which are 
most amenable to change due to this type of
psychosurgery.


SUMMARY


This study was designed to investigate the 
personality changes effected by a particular 
type of psychosurgery, transorbital lobotomy. 
The experimental or operative group was com- 
posed of eight hospitalized psychotic patients. 
From the parent hospital population, eight 
control subjects were selected on the basis of 
their close resemblance to the individual mem- 
bers of the experimental group in regard to 
seven criteria by which they were matched. 
The Rorschach test was administered to both 
groups one month prior to, and one month 
following, the date on which the experimental 
group received the operation. During the in- 
terim between pre- and posttesting, every 
effort was made to control environmental 
variables; members of each matched pair 
were treated alike, except that the controls 
did not receive the operation. However, the 
controls did receive electroshock treatments
comparable in number to those given the experimental
Ss on the day of operation.

Four Rorschach factors significantly dif- 
ferentiated between the control and experi- 
mental groups. There was a significant de- 
crease of m% and FK% and a significant 
increase in W% and Reaction Time in the 
experimental group following the operation. 
From these data it can be inferred that transorbital
lobotomy results in a lessening of inner
tension, a lessening of introspective self-awareness
and insight, and a loss of ardent enthusiasm
and active interest. The significant increase
in W% is difficult to interpret except as
some change in apperception. These findings
and their implications should be considered if 
psychosurgery of this type is planned. 


[...] psychology has recently been receiving much attention
(4). According to one of the most
extensive researches in this area (1), we can,
for example, type a person as an “authoritarian
personality,” and should be able to
predict how he will behave in nonsocial as
well as social situations. One of the distinguishing
characteristics of such a person, we find,
is rigidity. According to Rokeach, “the rigidity
inherent in an ethnocentric person’s solution of
social problems is not an isolated phenomenon
within the personality, but is rather an aspect
of a general rigidity factor which will also
manifest itself in the solution of any problem,
be it social or non-social” (18, p. 259).

Further analysis of the concept of rigidity,
as it is understood by other investigators,
reveals that there exists little agreement as to
the specificity or generality of rigidity of an
individual (1, 7, 10, 17). The Kounin-Werner
controversy (14, 21) and, more recently, the
Luchins-Rokeach controversy (17, 19) emphasize
the differences of conceptualization.
One of the striking deficiencies in this area of
study has been the lack of agreed-upon measurement
techniques; investigators have usually
developed their tests independently of each
other. The present investigation was an attempt
to synthesize some of the antecedent
studies.²

The study was designed to sample the so-called
measures of rigidity employed in a
number of other investigations, singly or in
various combinations (1, 2, 7, 8, 10, 16, 18),
to discover empirically what relationships
might exist among them when all are administered
to the same subjects. One of the
hypotheses being tested, then, was related to
the existence of a general factor of rigidity,
since the conclusions of the experimenters


1 This report is based upon a dissertation submitted 
to the University of Michigan in partial fulfillment of 
the requirements for the degree of Doctor of Philosophy. 
The writer wishes to express her appreciation to Dr. 
E. Lowell Kelly for his guidance and criticism in the 
direction of the dissertation and for reading and criticiz- 
ing this manuscript. 

2 Not all of the data collected will be presented here. 
Additional information can be found elsewhere (3). 
noted above include such variations as the
assumption of a general factor of rigidity (18), 
the proposal of an ego level and a peripheral 
level of rigidity (10), and an emphasis on the 
importance of social field conditions (16). 

This investigator was further interested in 
the relationship between age and rigidity in 
normal subjects, with emphasis on possible 
changes in this relationship in old age as a 
function of insecurity rather than of age itself. 
Therefore, the present study was conducted 
under three varying conditions of security. A 
second purpose of this study, then, was to 
investigate empirically how scores on each 
measure, and the relationships among these 
measures, might change under stress, i.e., to 
test for differences under varying conditions 
of security. 

We may, therefore, state our major hypothe- 
ses as follows: (a) There is no general factor 
among a number of so-called measures of 
rigidity under varying conditions of security. 
(b) On any particular measure of rigidity employed,
there are no significant differences
between mean scores of equivalent or matched 
groups under varying conditions of security. 

It was expected that the first hypothesis 
would be confirmed, but that the second 
hypothesis might be rejected. 


METHOD 


The subjects (Ss) were 79 candidates for Submarine 
School at the U.S. Naval Submarine Base, New London, 
Connecticut.* The stress condition was a real-life threat. 
All Submarine School candidates at the Base must pass, 
in addition to the physical and psychological require- 
ments, a pressure-chamber and water-tank-escape test. 
Thus, each candidate must demonstrate the ability to 
withstand 50 pounds of atmospheric pressure. If he fails 
this test, he is ineligible for further Submarine School 
training. If he passes, another test involves an “escape” 
from a 100-foot tank of water. Candidates are cleared 
medically before the pressure test, and there is no 
physical reason why any of them should not be able to 
withstand 50 pounds of pressure. Yet candidates do 





*The author wishes to express her thanks to Lt. 
Comdr. Dean Farnsworth, Comdr. Harry J. Alvis, and 
Captain T. L. Willmon for their cooperation in pro- 
viding Ss and facilities at the U. S. Naval Submarine 
Base, New London, and to the personnel at the base 
who served as Ss. 
TABLE 1

MEANS, SD’S, AND F’S FOR AGE, EDUCATION, GCT SCORES, AND LENGTH OF SERVICE

                    AGE (in yrs.)     EDUC. (in yrs.)        GCT          SERV. (in mos.)
Group         N     Mean     SD       Mean      SD       Mean      SD     Mean     SD
Day Before    24    19.17    1.61     11.83*    1.32     59.50     5.76   13.79    16.84
Day After     26    20.31    3.10     11.32*     .88     56.48*    6.13   29.15    34.93
Week After    29    19.52    2.84     11.50*    2.64     58.93     6.65   17.76    21.14
F†                   1.19              1.40               1.60             1.77

* Information not available for one S in this group.


fail at this point. Although no one has ever “failed” 
the tank-escape test, there have been candidates who 
refused to take the test. Therefore, it was believed 
that this total situation represented a threat to the 
candidate which might be considered a threat to his 
security. In support of this assumption, Cook and 
Wherry (9) have demonstrated that undergoing routine 
training at the escape training tank is sufficiently 
stressful to candidates to be reflected in a significant 
increase of 17-ketosteroid output over basal samples, 
the highest increase being during the prestress period. 

In our study, one group was tested the day before 
the stress situation, one group was tested one day 
after, and a third group was tested one week after the 
stress situation. (The N’s were 24, 26, and 29, respec- 
tively.) Each group comprised the entire group of 
available candidates during a particular month for 
three successive months, and may be considered a 
random sample of candidates for the purposes of this 
experiment, since no selection or elimination was em- 
ployed by the experimenter. A comparison of these 
groups for age, education, length of service, and Navy 
General Classification Test scores yields no significant 
differences (see Table 1). 

The battery of tests included six measures of rigidity: 
the group Rorschach (scored for rigidity according to 
Fisher [10]), the California Ethnocentrism Scale in its 
final 20-item questionnaire form (1), the Angyal Per- 
ceptual Test (2), the Luchins arithmetic Einstellung 
problems (16), the Luchins Hidden Words Test (16), 
and a Hidden Objects test (7, 15). Since all of these 
measures have been described in the literature, they 
will not be described further here. All were group ad- 
ministered in a single day with one rest period. The 
exact forms for all of these measures, and instructions 
for their administration and scoring, as well as further 
information on the complete battery utilized, can be 
found elsewhere (3). 


RESULTS AND DISCUSSION 


Kendall’s (12) tau correlation coefficients 
were computed for the scores of every test 
with every other test for each of the three 
groups; the resulting matrices are shown in 
Table 2. An examination of this table reveals 
no evidence of a generalized rigidity. Of the 
45 correlations among the six rigidity measures, 
22 are found to be negative, 21 are positive, 
and two are zero. Furthermore, only three of 
these 45 correlations reach the .05 level of 


† None of the differences was significant.





significance (we could have expected two by 
chance), and two of these three turn out to 
be negative correlations. There is, with rare 
exception, no consistency between any two 
measures under all three conditions. For 
example, we find the correlations between 
Ethnocentrism and Hidden Objects to be 
—.17, +.15, and +.01. 
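
The counts in this paragraph can be checked with a short sketch. It is illustrative only: `kendall_tau_a` is a hypothetical helper that implements the simple tau-a form with no correction for ties, which may differ from the exact procedure taken from Kendall (12), and the example scores are invented rather than drawn from the study.

```python
from itertools import combinations
from math import comb


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant (assumption).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    score = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            score += 1
        elif product < 0:
            score -= 1
    return score / comb(len(x), 2)


MEASURES = 6   # rigidity measures in the battery (GCT excluded)
GROUPS = 3     # Day Before, Day After, Week After
ALPHA = 0.05   # significance level used in the text

pairs_per_group = comb(MEASURES, 2)              # 15 pairs of measures
total_correlations = pairs_per_group * GROUPS    # 45 coefficients in all
expected_by_chance = total_correlations * ALPHA  # about two at the .05 level

# Invented scores for five hypothetical subjects on two measures.
example_tau = kendall_tau_a([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

With six measures there are 15 pairs per group, hence 45 coefficients over the three groups, and 45 x .05 = 2.25 is the "two by chance" mentioned above.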

In other words, we seem to be dealing with 
a chance distribution of correlation coefficients 
in spite of the fact that we chose measures on 
the basis of their having been used in previous 
investigations as measures of rigidity. There 
are two possible explanations for such findings: 
(a) the measures are not reliable, and un- 
fortunately reliability figures are not reported 
for any of the rigidity measures except the 
Ethnocentrism Scale (estimated at .90) and 
Hidden Objects Test (.55); and (b) the tests
are all measuring different things. The absence
of a general factor of rigidity is further
borne out by the coefficients of concordance 
(W) among these measures (W = .073, .151, 
and .153 for the Week After, Day After, and 
Day Before groups respectively). None of 
the three approaches statistical significance. 
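
For readers unfamiliar with the coefficient of concordance, the following minimal sketch shows one common way W is computed. It assumes untied scores and treats each measure as one ranking of the same subjects; the function names and the toy data are hypothetical, and measures scored in the opposite direction (low score = high rigidity) would first need to be reversed.

```python
def ranks(scores):
    """Rank scores from 1 (lowest) upward; assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def kendalls_w(score_rows):
    """Coefficient of concordance for m rankings (rows) of the same n subjects."""
    m = len(score_rows)
    n = len(score_rows[0]) if m else 0
    if m < 2 or n < 2 or any(len(row) != n for row in score_rows):
        raise ValueError("need at least 2 rows of equal length >= 2")
    rank_rows = [ranks(row) for row in score_rows]
    rank_sums = [sum(column) for column in zip(*rank_rows)]
    mean_sum = m * (n + 1) / 2
    # S: squared deviations of each subject's rank sum from the mean rank sum.
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Toy data: three measures that order four subjects identically give W = 1.
perfect_agreement = kendalls_w([[1, 2, 3, 4], [10, 20, 30, 40], [5, 6, 7, 8]])
# Two measures that order three subjects in exactly opposite ways give W = 0.
no_agreement = kendalls_w([[1, 2, 3], [3, 2, 1]])
```

W runs from 0 (no agreement among the rankings) to 1 (complete agreement), so values such as .073 to .153 indicate very little common ordering of subjects across the measures.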

Space does not permit a detailed discussion 
of these results or a comparison of how these 
results support or contradict those of other 
experimenters. However, because of the wide 
audience which has been reached by the work 
of Adorno, Frenkel-Brunswik, Levinson, and 
Sanford (1), and the so-called confirming 
evidence of Rokeach (18), let us look at the 
results for the Ethnocentrism Scale a little 
more closely. According to the investigators 
just named, ethnocentric people can be differ- 
entiated from nonethnocentric people. As 
Bruner has stated it, “... the major emphasis is
upon the representation of certain generalized 
personality processes in different specific 
spheres of mental functioning” (6, p. 122) 
(e.g., rigidity in so-called nonsocial problem-solving
situations). Our results do not support
such a simplified typology.


TABLE 2

MATRICES OF CORRELATION COEFFICIENTS (TAU) FOR A BATTERY OF SIX RIGIDITY MEASURES AND THE NAVY GENERAL CLASSIFICATION TEST†

Task      Ror.    Eth.     Ari.     Wds.     Obj.     Ang.     GCT
                  +.16     +.064    +.05     +.07     -.05     -.14
Ror.      x       -.02     -.09      0       -.15     +.04     -.20
                  -.14     -.17     +.08     +.08     -.05     -.12

                           +.185    +.23     -.17     +.02     -.25
Eth.              x        -.34*    +.31*    +.01     -.05     -.40**
                           +.21     +.05     +.15     -.10     -.32*

                                    +.20     -.39*    -.05     +.06
Arith.                     x        -.2%6    -.18      0       +.33
                                    +.03     +.16     +.16     -.29

                                             -.11     -.07     -.03
Wds.                                x        -.08     +.09     +.03
                                             -.277    -.25     +.34*

                                                      -.19     +.01
Obj.                                         x        +.28     -.03
                                                      -.03     -.27

                                                               -.%
Ang.                                                  x        +.05
                                                               +.07

GCT                                                            x

* Significant beyond the .05 level of confidence.
** Significant beyond the .01 level of confidence.
† Top row = week after stress; middle row = day after; bottom row = day before.


The only consistent positive correlations
under all three conditions are between the
Ethnocentrism Scale and the Hidden Words
Test, and even these become negative, incidentally,
if we interpret the Hidden Words
Test in the manner suggested by Cattell and
Tiner (7). Even if we divide our groups into
“Highs” and “Lows” on the Ethnocentrism
Scale, and compare their respective solutions
on the Luchins arithmetic Einstellung
problems for rigidity of solution, in a manner
similar to that employed by Rokeach, we find
contradictory results (see Table 3).

The results for the Day Before group support
the Rokeach study in that we find the
Highs solve fewer problems by the short
method than the Lows, and this difference is
statistically significant (p = .02). In the
Week After group, we find that both Highs
and Lows solve about the same number of
problems in the nonrigid way. The really contradictory
results, however, are those produced
by the Day After group, where the Highs
produce significantly more nonrigid solutions
than the Lows, a direct reversal of Rokeach’s
findings (p = .02).


TABLE 3

A COMPARISON OF MEAN NUMBER OF NONRIGID SOLUTIONS ON ARITHMETIC EINSTELLUNG PROBLEMS OF HIGHS AND LOWS ON ETHNOCENTRISM

                         MEANS                                   STANDARD DEVIATIONS
Group         N     Highs     N     Lows      t       p        Highs     Lows
Day Before    12     .80      12    2.17     2.74    .02        .87      1.82
Week After    14    «680      18     .67      .50    N.S.       .90       .87
Day After     13    1.08      13     .15     2.66    .02       1.14       .3%6





Brown (5) also found that the correlations 
between the F scale and several variations of 
the arithmetic Einstellung problems varied 
according to the conditions of administration, 
and were apparently a function of the degree 
of ego involvement. It is suggested that 
Rokeach’s results were obtained under ego- 
involving rather than neutral conditions. 

From an analysis of the findings so far re- 
ported, it would appear that we are left es- 
sentially with the rather disturbing picture of 
low correlations among our measures, which 
appear to behave erratically under different 
conditions of motivation. We conclude, there- 
fore, that there is no general factor among a 
number of so-called measures of rigidity under 
varying conditions of security. Let us now 
turn to a consideration of this “erratic” be- 
havior. 

As was pointed out above, we would seem 
well advised to be cautious about accepting a 
simple personality-centered theory which 
seems to leave little room for variations in 
field conditions. Harris (11) reports that 
creating a “test” situation increased rigidity. 
Luchins (17) has stressed the importance of 
field conditions in creating or overcoming the 
Einstellung effect. Although he attempted to 
exclude insecurity from his study, Kounin 
(13) does mention it, and other motivational 
factors, as causes of rigidity. 

We find that the intergroup comparisons in 
this study can be quite meaningfully organized 
within the framework of some speculations 
about the different motivations within each 
condition. Briefly, as can be seen in Table 4, 
there are some statistically significant mean 
differences among the groups, although the 
direction of these differences does not appear 
to be consistent. Let us first compare the more 
extreme conditions: the Day Before and the 
Week After groups. 
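
As a reading aid for Table 4, the sketch below recomputes a few of the printed F values from the printed standard deviations. It rests on an assumption the text does not state: that each F is the ratio of the two groups' variances, larger over smaller. The helper name `variance_ratio` is hypothetical, and only cells that are clearly legible in the table are used.

```python
def variance_ratio(sd_a, sd_b):
    """Ratio of the larger variance to the smaller (assumed form of the F column)."""
    larger, smaller = max(sd_a, sd_b), min(sd_a, sd_b)
    return (larger / smaller) ** 2


# (task, comparison, SD of first group, SD of second group, F as printed)
REPORTED = [
    ("Ror.", "DB-WA", 16.65, 10.28, 2.62),
    ("Eth.", "DB-WA", 20.52, 18.56, 1.22),
    ("Obj.", "DB-WA", 1.89, 2.80, 2.19),
    ("Ang.", "DB-WA", 2.72, 2.64, 1.06),
]

# Recompute each ratio to two decimals and pair it with the printed value.
checks = [
    (task, comparison, round(variance_ratio(sd_a, sd_b), 2), printed)
    for task, comparison, sd_a, sd_b, printed in REPORTED
]

for task, comparison, computed, printed in checks:
    print(f"{task:5s} {comparison}: computed F = {computed:.2f}, printed F = {printed:.2f}")
```

Under this assumption the recomputed ratios agree with the printed values for these four comparisons, which suggests that the second statistic in each row tests the difference in variability rather than the difference in means.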

The security-stress condition utilized in this 
study was an acute threat, physical in nature 
TABLE 4

MEANS, SD’S, AND TESTS OF SIGNIFICANCE OF DIFFERENCES FOR SIX RIGIDITY MEASURES

                                             Groups
Task       Group†    N     Mean     SD       Compared    t          F
           WA        29    22.79    10.28    DB-WA       +2.02*     2.62**
Ror.       DA        26    33.06    15.78    DB-DA       -.51       1.11
           DB        24    30.58    16.65    WA-DA       -2.84**    2.36*

           WA        29    86.48    18.56    DB-WA       -2.23*     1.22
Eth.       DA        26    85.92    19.27    DB-DA       -2.03*     1.13
           DB        24    74.25    20.52    WA-DA       +.11       1.08

           WA        29      .59      .89    DB-WA       +2.04*     3.44**
Ari.††     DA        26    2        9%       DB-DA       +1.84      2.95*
           DB        24     1.33     1.65    WA-DA       -.12       1.16

           WA        29     4.14     2.24    DB-WA       -.91       1.28
Wds.       DA        26     3.35     2.44    DB-DA       +.35       1.52
           DB        24     3.58     1.98    WA-DA       +1.23      2.19*

           WA        29     9.69     2.80    DB-WA       +1.43      2.19*
Obj.††     DA        26    10.23     2.72    DB-DA       +.65       2.07*
           DB        24    10.67     1.89    WA-DA       -.71       46

           WA        29     3.83     2.64    DB-WA       +1.72      1.06
Ang.       DA        26     5.19     2.66    DB-DA       -.08       1.06
           DB        24     5.13     2.72    WA-DA       -1.86      1.02

* Significant beyond the .05 level of confidence.
** Significant beyond the .01 level of confidence.
† WA = week after stress; DA = day after; DB = day before.
†† Low score = high rigidity on these tests; on all others, high score = high rigidity.


but with psychological implications. One may 
hypothesize that the candidate is attempting 
to “prove himself.” He is self-oriented, since 
“pass” or “fail” in the pressure chamber and
tank are not related to anyone’s ability but 
his own. It is not a competitive situation. 
Therefore, those tasks in this battery which 
are viewed as tasks in which he might in- 
advertently reveal something of himself should 
be most threatening to a subject under the 
security-stress conditions of the Day Before 
group. 

There are no “reality checks” on the answers
he gives to the Rorschach or the Angyal tests. 
Therefore, those tests should be perceived as 
most threatening, and should be expected to 
reveal greater rigidity in the Day Before group. 
We would assume that this group is more wary, 
is looking for hidden meanings. In that event, 
on both of these tasks, where they have no 
way of knowing if they are “right,” they would 
be most guarded in their approach. We find 
that this is true; i.e., the Day Before group 
is more rigid on these two measures. 

It can be seen, then, that on some of the 
other tasks, this same kind of approach might 
lead to behavior which could make the Day 
Before group appear more flexible than the 
Week After group. Thus, on those tasks where
hidden meanings can be found, we can expect 
the Day Before group to find more than the 
Week After group. This means that we might 
expect them to score less rigid on Hidden 
Words, Hidden Objects, and arithmetic 
Einstellung problems. This turns out to be 
the case, although several of these differences 
do not achieve statistical significance. 

What about the scores on the Ethnocentrism 
Scale? The Day Before group does not yet 
know if it is an “in-group” or an “out-group” 
in terms of Submarine School. Furthermore, it 
has been shown by Stouffer et al. (20) that in a
“threat to life” situation prejudice decreases.
We would therefore expect the Day Before 
group to be less ethnocentric than the Week 
After group. The mean difference is in the 
expected direction and is statistically signifi- 
cant. 

An analysis of the results of the Week After 
group can be explained similarly. These candi- 
dates know they have been accepted for school, 
but training has not yet started. There is no 
immediate threat for them. We can expect 
that they can return to the cultural norm and
view the experimental situation in a competi- 
tive frame of reference. Therefore, they would 
be most threatened by those tasks which do 
have a reality check, i.e., those on which they 
know if they are “right” or “wrong” when they
write down an answer. For these people, then, 
the Hidden Words, Hidden Objects, and arith- 
metic Einstellung problems are most threaten- 
ing. It is even conceivable that the last, which 
is most like a school-type problem, is the most 
threatening. Thus it is not surprising that the 
Week After group scores significantly more 
rigid on the arithmetic problems, and more 
rigid (although not significantly) on Hidden 
Words and Hidden Objects. 

The Day After group are the victims of 
conflicting motivating states. They have taken 
the tank and pressure tests, but do not yet 
know if they have been admitted to Submarine 
School, since candidacy can be refused even if 
these tests are passed. The results similarly 
indicate this conflict. We find that this group 
tends to score high on all of the rigidity meas- 
ures, but the coefficient of concordance is not 
higher than those for the other two groups. 
Thus it would appear that there are different 
kinds of people within the group who are 
affected differentially by the conflict of mo- 
tives. In this situation, it is conceivable that 
the most important determinant of the rigidity 
scores will be the kind of individual, i.e., one
who assumes he is “in,” assumes he is “out,”
or vacillates in his belief.

The above findings would seem to suggest 
that there are at least three major components 
which contribute to variations in the manifest 
behavior which we label “rigid.” These are
the individual, the nature of the task, and the
general conditions under which the tasks are
administered. Rather than asking “is there a
general factor of rigidity?” we might better
ask “under what conditions are we likely to
find generalized rigidity?”


SUMMARY 


The tests of rigidity included in this study 
have all been used in at least one other investigation
as measures of rigidity. Three equivalent
groups of Ss were tested the day before, one 
day after, and one week after a real-life stress 
situation which was presumed to arouse feel- 
ings of insecurity. Tau correlation coefficients 
were computed for the scores of every test 
with every other test, and appeared to yield 
essentially a chance distribution. Furthermore, 
coefficients of concordance among these meas- 
ures did not approach statistical significance. 
Therefore, it was concluded that there is no 
general factor among a number of so-called 
measures of rigidity under varying conditions 
of security. 

A comparison of mean differences among the 
groups on the various measures of rigidity 
revealed that there were some statistically 
significant differences, but the direction of 
these differences was not consistent. The 
hypothesis that, on any particular measure of 
rigidity employed, there are no significant 
differences between mean scores of equivalent 
groups under varying conditions of security 
could not be conclusively rejected. An ex- 
planation of these results was offered, and 
evidence in support of this explanation pre- 
sented. 

The following conclusions appear to be 
justified: (a) There is no general factor of
rigidity among a number of so-called measures 
of rigidity; the interrelationships of these 
measures appear to vary with the nature of 
the tests employed and the conditions of test 
administration, as well as behavioral determinants
within S(s). (b) Scores obtained
by an individual on any so-called measure of 
rigidity appear to be a function not only of 
the individual, but also of the nature of the
test and the conditions of test administration.


PHYSIOLOGICAL NEED, VERBAL FREQUENCY, AND WORD ASSOCIATION¹


PREVIOUS experiments in which the instructions
have been to “associate”
while under the influence of various
periods of food and water abstinence have 
employed such diverse methods and depriva- 
tion periods as to make comparison of the find- 
ings exceedingly difficult. Nevertheless, es- 
pecially since it is not available elsewhere, a 
perusal of the literature may prove valuable 
to see whether, despite differences in methods, 
certain conclusions can be synthesized to 
formulate hypotheses for further investigation. 


PROBLEMS 


1. The question of the exact strength of the 
physiologically induced deprivation has clearly 
been ignored by earlier research. Deprivation 
periods, both in the experimental and the con- 
trol groups, have been unsystematically varied 
from one to 24 hours. In other needs this would 
be less crucial than in hunger, which in our 
culture is highly cyclical. For example, in 
Sanford’s work (9, 10) the 27 experimental 
subjects (Ss) were given a battery of tests 
(including words for association, pictures for 
interpretation and completion, and various 
subjective rating scales) after a 24-hour fast, 
while the 35 control Ss were tested at various 
times during the normal eating cycle. In the 
Chein, Levine, and Murphy experiment (3) 
Ss were told to verbalize an association to 40 
colored and 40 achromatic cards (containing 
ambiguous drawings and drawings of miscellaneous
household articles); the five experimental
Ss were tested once a week with
one, three, six, and nine hours of abstinence 
from food, while the five control Ss were tested 
from 45 minutes to two and one-half hours 
after eating. The results of these studies will 
be analyzed later. Our present object is to 
suggest that these experiments carried the 


1 This research was supported by a grant from The 
Ohio State University Research Foundation. The 
writer wishes to thank Ferdinand van der Veen and 
Melvin Lerner for assistance in collecting and proc- 
essing the data. Dr. Julian Rotter graciously read 
this paper, and made many helpful suggestions. 


deprivation periods neither far enough nor 
systematically enough. This applies as well 
to the McClelland and Atkinson project (1, 6) 
to be discussed below. Therefore, the first 
purpose of this research is to induce a pro- 
tracted period of total abstinence from food 
and water, and to test during the course of the 
deprivation the relationship between the 
physiologically induced need and the word 
associations to need-relevant and neutral 
stimuli. 

2. McClelland and Atkinson (1, 6) tested 
their Ss 1, 4, and 16 hours after eating. In 
one experiment, 12 blanks were flashed upon 
the screen with various hints given: naming 
food objects, places related to eating, pro- 
jected feeling responses; and three were shown 
with no comment. In another aspect of the 
same study 40 Ss were asked to make size and 
number comparisons of food-related and non- 
food-related objects during three deprivation 
periods. In a separate experiment, using the 
same Ss and procedures, protocols were col- 
lected on eight TAT-type pictures, represent- 
ing all aspects of the food-getting process. 
Responses were categorized to cover the whole 
gamut of food acquisition, as “goal objects” 
(food), “instrumental responses”’ (knife, fork), 
“food imagery,” etc. The results showed that 
although the total number of food responses 
increased reliably in both experiments with 
hours of deprivation, this was due primarily 
to reliable increases in the instrumental re- 
sponses alone. The size and number estimates 
of food-related objects also increased signifi- 
cantly. The object responses, however, did 
not increase reliably with increased depriva- 
tion, nor did the food-related imagery in the 
TAT-type test. One of the general conclusions 
following from McClelland and Atkinson’s 
results is that the responses made under in- 
duced needs have an importance for the study 
of the cognitive processes beyond their sheer 
number. Their analyses have shown that the 
gross number of food responses is concealing 
important information. The second purpose 
of this research, therefore, is to extend and
confirm the response categorization of Mc- 
Clelland and Atkinson, especially to sharpen 
the definitions of the categories, and to include 
classifications for the affective responses that 
characterize the verbal expressions of intense 
needs. 

3. The work of Solomon and Howes (4, 12) 
with recognition times has demonstrated the 
importance of the commonness of the stimulus 
material. Their findings suggest that in study- 
ing the associative process, cognizance must 
be taken of the strength—in Thorndikian 
terms—of the stimulus material. The final 
purpose of this research is, thus, to determine 
in what way the commonness of the stimulus 
word will affect the number and kinds of 
association responses. 


METHOD 


The stimulus material. Forty-eight stimulus words 
were used, of which half were need-relevant and half 
were neutral (supposed to have no connotations of 
food and/or water deprivation). Of the 24 “need” 
words, 12 were related to food deprivation and 12 to 
water deprivation. Following the work of McClelland 
and his colleagues (1, 6), the need words were divided 
into “act” words (those denoting the taking of food 
and/or water into the body) and “object” words (the 
names of satisfiers, etc.). Thus there were four
subclasses of need words: (a) food-act words, (b) food-object
words, (c) water-act words, and (d) water-object
words; and a class of neutral words which were matched 
with the need words for commonness. The stimulus 
words were randomized in two lists of 24 words each, 
each list containing equal representation from the four 
subdivisions of the need words, and the same number 
of neutral words. 
The commonness of the stimulus words was assessed
as follows: Correlations were computed between the 
Thorndike-Lorge (13) frequency ratings for the stimulus 
words and the estimates of two independent random 
samples (N’s of 100 and 150) of college undergraduates 
who were asked to rate the stimulus words as “common,”
“mid-common,” or “uncommon,” depending
upon their frequency of usage in the daily vocabulary 
of the judges. Since these correlations ranged between 
.70 and .97, the undergraduate judgments were used 
in the final commonness ratings. Table 1 gives the 
stimulus words, their order of presentation, ratings, 
and classifications. 

Subjects. The Ss used in the word association study, 
men and women students at The Ohio State University, 
were 50 of the 60 Ss who had just served in the recog- 
nition-time experiment described elsewhere (14). None 
had completed an introductory course in psychology. 
Although the same words were used in both the word 
association and the recognition-time study, upon inter- 
rogation none of the subjects reported awareness of 
this fact. 

Deprivation conditions. On the testing days Ss were
not permitted to eat or drink, and they were requested 
to limit their activities to classes and study. Smoking 
was permitted with restraint. Three levels of depriva- 
tion were induced: 0-2 hours (the control group), 
10 hours, and 24 hours. The 0-hour group ate a normal 
but late lunch; the 10-hour group ate breakfast at 
7:30 a.m.; and the 24-hour group had their last meal 
at dinner time on the night preceding their testing 
period. All testing started at 5:30 p.m. The Ss were 
randomly assigned to one of the three deprivation 
periods, and one of the two lists. 

Procedure. The study was presented as one of dep- 
rivation and verbal fluency. The Ss were instructed 
“to give the first words which came to mind,” and to 
continue associating “until the experimenter said
‘Stop.’” The experimenter (E) then read the stimulus
words to S. Seated at a table about six feet behind and 
to the left of S, E recorded the first 19 responses. Some- 
where between the nineteenth and twenty-fifth re- 
sponse, E stopped S, and proceeded to the next word. 


TABLE 1

PRESENTATION ORDER AND CLASSIFICATIONS OF THE TWO LISTS OF STIMULUS WORDS

List I

munch (AUF)*        drink (ACW)
godson (NU-)        coke (OCW)
need (NC-)          dine (ACF)
mean (NC-)          waddle (NU-)
serenade (NU-)      waffle (OUF)
feed (ACF)          chuckle (NU-)
lemonade (OUW)      mine (NC-)
think (NC-)         hunch (NU-)
comb (NC-)          chocolate (OUF)
imbibe (AUW)        guzzle (AUW)
milk (OCW)          dive (NC-)
meat (OCF)          imbed (NU-)

List II

nibble (AUF)        sip (ACW)
devout (NU-)        water (OCW)
beat (NC-)          drain (ACW)
speak (NC-)         idler (NU-)
solar (NU-)         cider (OUW)
eat (ACF)           ram (NU-)
soda (OUW)          lake (NC-)
rip (NC-)           quibble (NU-)
weather (NC-)       ham (OUF)
gulp (AUW)          devour (AUF)
cake (OCF)          dread (NC-)
gust (NU-)          steak (OCF)

* The classification of the words is: A, act; O, object; N, neutral; C, common; U, uncommon; F, food; W, water.

RESULTS


Our first concern is the effect of the stim- 
ulus words upon the word association re- 
sponses. The responses may be categorized as 
“food-related” (meat, bread, cake, etc.), 
“water-related” (water, pitcher, beer, etc.), 
and “neutral responses” (those responses
supposedly unrelated to food and water dep- 
rivation). The chi-square analysis of the 
data presented in Fig. 1 is significant beyond 
the 1% level. The food-related stimuli beget 
more food-related responses, the water- 
related words elicit more water association, 
and the neutral stimuli get more neutral 
word responses. Thus there is a congruity be- 
tween the nature of the stimulus words and 
the kinds of responses made to them. 

The design of the experiment permitted the 
investigation of the hypotheses, derived from 
related research (1, 6, 12), that the “act-object”
and “common-uncommon” classifications
of the stimulus words would be related
to the number and kinds of responses made. 
The act-object stimuli prove to be nearly 
significantly related to the number of food, 
water, and neutral responses (chi square = 
5.70; a chi square of 5.99 is significant at the 





FIG. 1. NUMBER OF FOOD, WATER, AND NEUTRAL
ASSOCIATIONS TO FOOD, WATER, AND
NEUTRAL STIMULUS WORDS
5% level); the object words received more
food responses and the act words elicited more 
water associations. There was also reason to 
believe that more responses would be made 
to the need-related common word stimuli, 
but the chi-square analysis fails to confirm 
this. We can summarize our findings thus far 
by noting that although neither the act-object 
analysis of the stimulus words nor the com- 
mon-uncommon one is significant, the food- 
water-neutral classification proves to be an 
important factor. The implications of these
findings for an understanding of the associative
process are clear; even in the presence of
strong deprivation, certain categories of the 
stimulus words are significant determiners of 
the associations made. 
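
The comparison of the obtained chi square (5.70) with the 5% critical value (5.99) can be checked with a short sketch. It assumes 2 degrees of freedom, which is what the quoted critical value of 5.99 implies for a 2 x 3 classification; for that case the upper-tail probability has the closed form exp(-x/2). The function names are illustrative only and are not part of the original analysis.

```python
import math


def chi2_upper_tail_df2(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


def chi2_critical_df2(alpha: float) -> float:
    """Critical value of chi square (2 df) for significance level alpha."""
    return -2.0 * math.log(alpha)


critical_5pct = chi2_critical_df2(0.05)    # assumed df = 2
p_act_object = chi2_upper_tail_df2(5.70)   # obtained chi square from the text

print(round(critical_5pct, 2))  # 5.99
print(round(p_act_object, 3))   # 0.058
```

Under this assumption the act-object classification falls just short of the 5% level, which agrees with the description of the result as nearly significant.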

Our second major interest is the relation- 
ship between the need-related responses and 
the deprivation periods. The food- and water- 
related responses were categorized for each 
of the deprivation periods. This chi square is 
significant beyond the 1% level, indicating 
that there is a significant relationship between 
the number of food and water word associa- 
tions and the hours of deprivation. More 
food- and water-related responses are given 
at 10 hours of deprivation than at zero hours 
(the control period). However, at the 24-hour 
deprivation period there is a slight decrease 
in the number of food and water responses. 
Our results suggest that the relationship of 
need-relevant associations and a physiological 
need is of the general curvilinear type. 

Our third interest is with the nature and 
the number of the word association responses 
and the periods of deprivation. The responses 
can be further refined into any one of the five 
categories listed below: 


1. Act responses—verbs and verb forms (gerunds) 
directly denoting taking food and/or liquids into the 
body, as eat, eating, masticate, drain, etc. 

2. Object responses—names of hunger and thirst 
satisfiers, as the names of foods in their edible state 
and the names of meals. 

3. Instrumental responses—names of objects and 
processes instrumental to need satisfaction which, 
however, are not themselves satisfiers, as (a) the names 
of specific establishments for obtaining food and/or 
drink (restaurant, bar, etc.); (b) the names of all utensils
necessary for preparing, containing, and consuming 
food and/or beverages in our culture (pots, plates, 
glass, etc.); (c) the names of foods which in our culture 
are not edible in their raw form (grain, hogs, etc.); (d) 
the names of all processes for the preparation of food 
(cooking, baking, etc.). 
TABLE 2

THE NUMBER OF ACT, OBJECT, INSTRUMENTAL,
AFFECTIVE, AND NEUTRAL WORD ASSOCIATION
RESPONSES MADE AT EACH DEPRIVATION PERIOD

DEPRIVATION                    INSTRU-   AFFEC-
PERIODS       ACT    OBJECT    MENTAL    TIVE     NEUTRAL    TOTAL

 0            132      748       297      112      5,551     6,840
10            186    1,075       192      197      5,190     6,840
24            143      696       299       81      5,621     6,840

Total         461    2,519       788      390     16,362

4. Affective responses—all those adjectival responses 
referring more or less exclusively to hunger and thirst 
satisfiers, as sweet, sour, delicious, etc. 

5. Neutral responses—all responses which could 
not be coded into one of the above categories, and 
which were taken therefore as generally unrelated to 
the hunger-thirst process. 


Table 2 gives the number of responses in 
each of the five categories for each of the 
deprivation periods. This information is 
presented graphically in Fig. 2. In their raw 
form these data failed to satisfy the assump- 
tion of homogeneity for the analysis of vari- 
ance, so a logarithmic transformation was 
necessary (11). The results of the analysis of 
variance of the transformed data are given 
in Table 3. The principal hypothesis, that the 
deprivation will selectively influence the 
number and kinds of word association re- 
sponses, is tested by the interaction, which 
is significant beyond the 1% level. Inspection 
of Fig. 2 shows that at the 10-hour deprivation 
period there is an increase in the number of 
act, object, and affective responses, but that 
these responses decrease by the 24-hour period. 
The instrumental associations, on the other 
hand, decrease at the 10-hour period, but 
increase by the 24-hour deprivation period. 
Tests for the significance of the differences (t)
indicate that the differences among the mean 
number of instrumental responses at 0, 10, 
and 24 hours are significant beyond the 5% 
level. This is also true for the affective re- 
sponses, but not for the act or object classi- 
fications. Furthermore, the mean number of 
instrumental responses is significantly dif- 
ferent (beyond the 5% level) from the mean 
number of act, object, and affective associa- 
tions at both the 0- and the 24-hour depriva- 
tion periods. 

In the analysis of variance (Table 3) the 
TABLE 3

ANALYSIS OF VARIANCE OF THE LOGARITHMIC
TRANSFORMATION OF THE WORD
ASSOCIATION DATA

SOURCE                            df    VARIANCE         F         p

Response categories                3    6.63401463      19.0698   <.01
  (act, object, instrumental,
  and affective)
Deprivation periods                2     .11091195       1.28
Interaction                        6     .347879866      4.0346   <.01
Within cells                     168     .0862234285

Total                            179     .2055695955


differences among the response-category means 
are significant beyond the 1% level. Perusal 
of Fig. 2 shows that there are more object 
associations made at all hours. Both these 
results are probably due to the fact that our 
language has more words for hunger and thirst 
satisfiers than for acts instrumental to the 
satisfaction of hunger and thirst. 
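
The F values in Table 3 can be recovered from the variance estimates with a brief sketch. The choice of error terms below is an inference from the reported figures, not something the text states: the interaction and deprivation F values match a test against the within-cells variance, while the response-category F matches a test against the interaction variance. The dictionary keys and the function name are illustrative.

```python
# Variance estimates (mean squares) as reported in Table 3.
MEAN_SQUARES = {
    "response_categories": 6.63401463,
    "deprivation_periods": 0.11091195,
    "interaction": 0.347879866,
    "within_cells": 0.0862234285,
}


def f_ratio(effect: str, error: str) -> float:
    """Ratio of an effect's variance estimate to the chosen error variance."""
    return MEAN_SQUARES[effect] / MEAN_SQUARES[error]


# Assumed error terms (inferred from the reported F values, see lead-in).
f_interaction = f_ratio("interaction", "within_cells")
f_deprivation = f_ratio("deprivation_periods", "within_cells")
f_categories = f_ratio("response_categories", "interaction")

print(round(f_interaction, 4))  # 4.0346
print(round(f_categories, 4))   # 19.0698
```

The deprivation ratio computed this way is about 1.29, which the table reports truncated as 1.28.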

Our findings indicate that the classification 
of instrumental associations operates dif- 
ferently with increasing deprivation from 
the act, object, or affective response cate- 
gories, and that object associations decrease 
and instrumental responses increase with in- 
creased deprivation periods. 


DISCUSSION 


What general conclusions can be reached 
about the effects of abstinence from food and 
water upon the associative process? Despite
the difficulties inherent in generalizing from 
the results of different experiments, three 
aspects of the relationship between physio- 
logically induced deprivation and word asso- 
ciation should be indicated. 

1. In the first place, the number of food and 
water responses, generically conceived, does 
not increase linearly with hours of deprivation. 
In general, those studies which have apposited 
physiological deprivation and tests of projec- 
tion or word association have found a signifi- 
cant increase in the number of need-related 
responses up to a certain point, but beyond 
this point the need responses decrease. 

For a precise description of the relationship 
between the associative process and depriva- 
tion it is necessary to know the exact need 
strength at which the curve for need responses 
begins to decelerate. This information can be 
obtained only by the construction of a family 
of curves derived from systematically spaced 
testing of the deprivation periods using comparable
material throughout. Although only
in this way will we be able to discuss the 
function of association on deprivation, some 
hypotheses may be derived by considering the 
findings of previous research in this area. In 
the Sanford study (10) the results of the con- 
trols, who were tested from one to four hours 
after eating, showed that the number of food 
responses continued to increase up to four 
hours but that thereafter they leveled off. 
In the McClelland and Atkinson research (6) 
the mean number of instrumental responses 
increased markedly from the one-hour to the 
four-hour deprivation periods, but between 
the four-hour and the 16-hour deprivation 
periods the curves were negatively accelerated 
and the differences were not significant. Chein, 
Levine, and Murphy (3) found that there was 
a decrease in the number of food responses 
occurring between the six- and nine-hour 
deprivation periods, and this occurred some- 
time after ten hours in the present study. 
Juxtaposing these findings, we can derive the 
following relationship between deprivation 
and the gross number of need-related associa- 
tions: The number of need-word associations 
increases as a function of deprivation up to 
about four hours, although this may be largely 
the result of a “food habit” rather than real 
tissue need. The need responses maintain 
themselves at this increased level until about 
the ten-hour deprivation period. Sometime
after nine or ten hours of deprivation the 
number of need-related responses decreases. 
Thus our hypothesized function of need- 
related association on physiological deprivation 
is one in which the curve for the need re- 
sponses accelerates rapidly at the outset, 
becomes asymptotic, and finally decelerates 
with the protracted periods of experimentally 
induced deprivation. Probably the relation- 
ship between physiologically induced depriva- 
tion states and need-related responses, when 
it is completely written, will be curvilinear, as 
in the other studies of motivation and per- 
formance. 

2. The more recent experiments have shown 
that important information may be obtained 
from a finer analysis of the need-relevant 
responses. For example, McClelland and At- 
kinson’s categorization into instrumental and 
goal responses proved helpful in reconciling 
some of their findings with those of Sanford 
(10), for Sanford was recording only goal 
responses, which in both studies failed to 
increase significantly (6, p. 220). Although it 
is difficult to compare the results of the present 
experiment with those of McClelland and 
Atkinson (6) (especially since they tested 
in a group situation, which tends to depress 
egocentric responses) both studies show that 
as deprivation increases, Ss make more re- 
sponses which are concerned with acts instru- 
mental to need-satisfaction, while the number 
of names of need satisfiers decreases.²

3. In order to account for the rise and fall
of the need-related associations, especially
goal responses, all the writers in this area have
had to conceptualize two antagonistic processes;
the one, call it “wish fulfillment,” or
“autism” (8), or “vivification” (2), accounts
for initial rise in the curve, while the other,
call it “Drang nach Realität” (6) or “realism”
(8), attempts to explain the subsequent decline
in the number of need responses. Thus,
regardless of one’s terminological preferences,
the empirical finding remains that some place
between the 10-hour and the 24-hour deprivation
periods both the number and the kinds of
responses take an important shift. Behind
each of the various ways of conceptualizing


² MacLeod’s excellent phenomenologically oriented
discussion of the motivation and perception problem 
(7) suggests an interesting interpretation of this em- 
pirical result. 
this fact is the suggestion that this shift is
contingent upon the ungratifying nature of 
the responses to the subjects. 


SUMMARY 


Fifty college men and women were deprived 
of food and water for 0, 10, and 24 hours, and 
were presented with a word association list of 
24 words which had been matched for com- 
monness and need-relevance. Each S was 
tested only once. The results show that (a) 
more food, water, and neutral word associa- 
tion responses were made to food, water, and 
neutral stimulus words, respectively; (b)
there was an increase in the number of food 
and water responses up to the tenth hour, but 
a decrease thereafter; and (c) with protracted 
periods of deprivation the number of responses 
pertaining to acts instrumental to need satis- 
faction increased while the number of re- 
sponses involving the names of need satisfiers 
decreased. The implications of these findings 
for a generalized statement of the function 
of association on physiological deprivation 
are discussed, with the hypothesis that the 
relationship is a curvilinear one. 
AGGRESSION


PAUL H. MUSSEN AND H. KELLY NAYLOR
The Ohio State University 


Studies of the relationship between
fantasy and overt behavior have
yielded varied results (5, 7, 8). Sanford
et al. (7) found that ratings of some fantasy
needs derived from the TAT were positively
correlated with ratings of overt behavior manifesting
these needs, while in other cases the
strength of the needs expressed in fantasy
was negatively correlated with the degree of
overt expression. Correlations between TAT
fantasy needs and final staff ratings on overt
behavior ranged from +.41 to -.44, with an
average of +.11. Murray (5) also found differences
among the variables in the extent to
which overt and fantasy expressions corresponded.
In a group of college men, the two
forms of expression were positively correlated
on variables such as abasement, creation,
dominance, exposition, nurturance, passivity,
rejection, and dejection; but there was a negative
correlation between fantasy and overt
sex needs.

From these studies it may be concluded that
some needs reflected in the TAT are also
apparent in overt expression, while other
needs which are revealed strongly in the TAT
are seldom demonstrated in overt behavior.
As Lindzey has pointed out, one of the major
problems involved in interpreting the TAT is
the “determination of the conditions under
which inferences based on the projective
material directly relate to overt behavior and
the conditions for the reverse” (4, p. 18).
With respect to aggression specifically, Murray
(5) found no correlation between the intensity
of the need in fantasy and its overt
expression. On the other hand, Sanford et al.
(7) found that aggressive needs were among
those which occurred frequently in the TAT
stories of adolescent subjects, but, according
to teachers’ reports, were infrequently expressed
in the overt behavior of this group.
The correlation between ratings of the subjects’
TAT aggressive needs and final staff
ratings of the degree of their manifestation
of this need was +.15.


In explaining their findings, Sanford et al.
(7) suggest that certain antisocial needs such 
as aggression may appear in the TAT stories 
but not overtly because cultural prohibition 
or internal conflict prevents the overt gratifi- 
cation of these needs and thereby increases 
their intensity in the individual’s fantasies. 
According to these writers, needs which were 
frequently present in both fantasy and overt 
behavior were those which are encouraged 
by the culture, but, generally speaking, the 
individual does not have sufficient oppor- 
tunity for their satisfaction. 

In the middle class from which Sanford’s 
and Murray’s subject populations were 
drawn, there are strong punishments for the 
expression of aggression. However, in lower- 
class culture, aggressive behavior is not 
punished but is encouraged (1). Although the 
manifestation of aggressive needs is accept- 
able, different individuals within the group 
have, as a result of their particular back- 
grounds and experiences, different strengths 
of these needs. If Sanford’s hypotheses are 
correct, it can be predicted that, in a lower- 
class population, those who have intense 
fantasy aggression needs will express these 
needs in their overt behavior. 

The first hypothesis of the present study 
is based on Sanford’s suggestions but is 
phrased in more quantitative terms. Specifi- 
cally it states that, in a lower-class group, 
individuals who give evidence of a great deal 
of fantasy aggression will also manifest more 
overt aggression than those who show little 
aggression in their fantasies. 

Sanford’s statements about the withholding 
of the overt expression of aggressive impulses 
are consistent with the theories of Dollard 
et al. (2), the Yale frustration-aggression 
theorists, who maintain that “the strength 
of inhibition of any act of aggression varies 
positively with the amount of punishment 
anticipated to be a consequence of that act” 
(2, p. 33). Although, generally speaking, the 
inhibition of aggressive expression is not an 
important part of lower-class mores, many individuals
of this class have experienced punishment
following the expression of aggression
and consequently have learned to withhold 
this expression. In short, although there is no 
cultural prohibition against the expression of 
aggression, some lower-class people have 
“internal conflict” about it, i.e., they anticipate
punishment for aggressive behavior.

The second hypothesis of the present study, 
derived from the frustration-aggression hy- 
pothesis, states that individuals who have 
strong fears of punishment relative to their 
aggressive impulses will manifest less overt 
aggression than individuals whose fear of 
punishment, relative to their aggressive needs, 
is small. In testing this hypothesis, the amount 
of punishment anticipated and the strength 
of aggressive needs were expressed as a 
fraction referred to as the punishment- 
aggression (P/A) ratio. The second hypothe- 
sis, phrased in terms of this ratio, states that 
individuals with high P/A ratios will show 
less overt aggression than those having low 
P/A ratios. 

If the first two hypotheses are supported 
or even partially supported, they may be 
combined into a third hypothesis which could 
also be systematically checked. If lower-class 
individuals with great fantasy aggressive 
needs express more overt aggression than those 
who have fewer fantasy aggressions (Hy- 
pothesis 1), and if individuals with low P/A 
ratios are more overtly aggressive than those 
with high P/A ratios (Hypothesis 2), then it 
follows that: among lower-class individuals, 
those who have high aggressive needs together 
with a low P/A ratio will manifest more 
overt aggressive behavior than those who have 
few fantasy aggressive needs together with a 
high P/A ratio (Hypothesis 3). 


METHOD 


Twenty lower-class white boys and nine lower-class 
Negro boys at the Bureau of Juvenile Research in 
Columbus, Ohio, served as subjects (Ss) in this study. 
Their ages ranged from 9-0 to 15-8, and almost all of 
them had been referred to the Bureau for behaviors 
which brought them into conflict with school and court 
authorities, i.e., truancy, stealing, disorderly behavior 
in school, running away, etc. 

In order to check the three hypotheses, measures of 
fantasy aggression, fear of punishment, and aggressive 
behavior were required. The amounts of each S’s
fantasy aggression and anticipation of punishment were
determined by analyzing his responses to TAT cards 1,
3BM, 4, 6BM, 7BM, 8BM, 12M, 13B, 14, and 18GF.
These were administered in the standard way within 
two days after S arrived at the Bureau and before he 
joined the cottage group to which he was assigned for 
the remainder of his stay. 

In the analysis of the stories, any act or thought of a 
hero of a TAT story which, implicitly or explicitly, had 
as its “goal response...injury to an organism (or 
organism surrogate)” (2, p. 11) was assumed to be a 
reflection of an aggressive need on the part of S. A 
fantasy aggression (FA) score was derived for each S by 
simply counting the number of times aggressive acts 
appeared in his 10 stories. The following were considered 
acts of aggression: fighting, killing, criminally assault- 
ing; getting angry, hating, quarreling, cursing; criticiz- 
ing, blaming, ridiculing; breaking and smashing objects; 
escaping restraint, running away, resisting coercion; 
being negativistic, resisting authority, lying, cheating, 
stealing, gambling; forcing someone to change his be- 
havior or ideas; domineering or restraining someone; re- 
jecting, scorning, or repudiating someone; suicide, self- 
injury, self-depreciation. The occurrence of death, 
illness, or accident of the parents in a story was regarded 
as an indirect expression of hostility and hence scored 
as a fantasy aggression. 

A measure of S’s fear of punishment was derived in 
an analogous way. A punishment (P) score was ob- 
tained by cumulating the number of times the heroes 
of the stories were subjected to punishment press. As in 
the case of aggression, punishment was very broadly 
defined and any of the following was considered an 
instance of this press when it was directed toward the 
hero: punishment, assault, injury, killing; hate, threat, 
quarreling; deprivation of some privilege, object, or 
comfort; force, domination, restraint; physical handi- 
cap such as blindness, etc.; rejection, scorn, repudiation. 

Suicide, self-depreciation, and death, illness, or 
accident of parents or other loved objects were scored 
as both fantasy aggressions and punishment press. The 
first two are obviously self-punitive; the last are 
included because a broad definition of punishment 
includes “injury to a loved object” (2, p. 34).

Observation of the overt aggressive behavior of S 
began the day he entered his cottage group and con- 
tinued for two weeks. Five attendants and a handicraft 
teacher served as observers (Os) and recorded on two 
forms, a weekly rating scale and a daily behavior report 
that were designed to facilitate and objectify 
observations. 

The daily behavior report, the first measure of 
aggressive behavior, consisted of a check list of twelve 
kinds of aggressive behavior: physical attack, bragging, 
threatening, teasing, saucy-impertinent, insulting name- 
calling, ridiculing, bullying, verbal castigation, mali- 
cious gossip, destructiveness, and temper tantrums. 
Each O filled out one of these reports each day for each
of the Ss under his care, checking the appropriate space 
if that type of behavior occurred that day. 

The total number of incidents of aggressive behavior 
as indicated by the total number of checks on the forms 
was tabulated for each child. Unfortunately, despite 
the original plans, not all Ss were observed for the same 
number of days or by the same number of Os; conse- 
quently, the number of reports submitted was not the 
same for all Ss. In order to get comparable measures of 
aggressive behavior, the total number of checks recorded
for each S was divided by the number of reports
submitted for him multiplied by 12, the number of 
aggressive behaviors listed on each report. This ratio, 
henceforth designated DBR, may be regarded as the 
proportion of the aggression observed to the total 
amount which theoretically could have been observed. 
These ratios ranged from .000 (least aggression) to .398 
(most aggression), with a mean of .088 for the 29 Ss. 
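
The DBR computation described above can be sketched as follows. The function name and the example counts are hypothetical; only the formula (checks divided by the number of reports times 12) comes from the text.

```python
BEHAVIORS_PER_REPORT = 12  # kinds of aggressive behavior on each daily check list


def dbr_ratio(total_checks: int, reports_submitted: int) -> float:
    """Proportion of observed aggression to the maximum that could have been checked."""
    if reports_submitted <= 0:
        raise ValueError("at least one report is required")
    possible_checks = reports_submitted * BEHAVIORS_PER_REPORT
    if not 0 <= total_checks <= possible_checks:
        raise ValueError("checks must lie between 0 and reports * 12")
    return total_checks / possible_checks


# Hypothetical S: 21 checks across 20 submitted reports.
example_dbr = dbr_ratio(total_checks=21, reports_submitted=20)
print(example_dbr)  # 0.0875
```

Dividing by the number of reports is what makes Ss observed for different numbers of days, or by different numbers of Os, comparable.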

The second measure of aggression, the weekly 
rating scales, consisted of five separate scales involving 
different aspects of aggression: uncooperative-coopera- 
tive, amiable-quarrelsome, aggressive-submissive, 
docile-rebellious, and antagonistic-friendly. These 
scales, which were designed to obtain an over-all 
measure of aggressive behavior, were completed by each 
O for each S under his care at the end of the first week 
of observation and again at the end of the second. 
Each of the five aggression scales was divided into 11 
equal units and scored according to the unit in which 
the rating check fell, 1 representing the minimum 
amount of aggression and 11 the maximum. The weekly 
rating-scale score for each S was the sum of the scores 
for the five scales. Since the number of reports sub- 
mitted varied among Ss, average weekly rating (WRS) 
scores were used. For the 29 Ss, these scores ranged 
from 11.7 (lowest aggression) to 47.0 (highest aggres- 
sion), with a mean of 29.38. The rank-order correlation 
between these two behavioral measures of aggression 
was .86. 
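
A minimal sketch of the WRS scoring follows, assuming each rating has already been oriented so that 1 is the least and 11 the most aggressive unit, as the text describes. The function names and the two example reports are hypothetical.

```python
from statistics import mean
from typing import Sequence

SCALE_COUNT = 5          # five aggression scales per weekly report
MIN_UNIT, MAX_UNIT = 1, 11


def report_total(ratings: Sequence[int]) -> int:
    """Sum of the five scale scores on one weekly rating report."""
    if len(ratings) != SCALE_COUNT:
        raise ValueError("each report needs exactly five scale ratings")
    if any(not MIN_UNIT <= r <= MAX_UNIT for r in ratings):
        raise ValueError("each rating must fall in units 1 through 11")
    return sum(ratings)


def wrs_score(reports: Sequence[Sequence[int]]) -> float:
    """Average weekly rating-scale score over all reports submitted for one S."""
    return float(mean(report_total(r) for r in reports))


# Hypothetical S with two submitted reports (totals 30 and 34).
hypothetical_reports = [(6, 7, 5, 6, 6), (7, 8, 6, 7, 6)]
example_wrs = wrs_score(hypothetical_reports)
print(example_wrs)  # 32.0
```

On this scoring a single report can range from 5 to 55, so the reported range of 11.7 to 47.0 sits well inside the possible limits.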


RESULTS 


The two types of data obtained—scores for 
fantasy aggression and fear of punishment, 
and the objective aggressive behavior scores 
—were used in testing all three hypotheses. 
For this purpose, all distributions were dichot- 
omized into high (median and above) and 
low (below median) groups. 
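
The dichotomization rule can be illustrated with a short sketch in which scores at or above the median are classed as high and scores below it as low. The subject labels and scores are hypothetical, and the function name is illustrative.

```python
from statistics import median
from typing import Dict, Mapping


def median_split(scores: Mapping[str, float]) -> Dict[str, str]:
    """Label each S 'high' (median and above) or 'low' (below median)."""
    cut = median(scores.values())
    return {
        subject: ("high" if value >= cut else "low")
        for subject, value in scores.items()
    }


# Hypothetical FA scores for five Ss; the median is 5.
hypothetical_fa = {"S1": 2, "S2": 9, "S3": 5, "S4": 5, "S5": 11}
groups = median_split(hypothetical_fa)
print(groups)
```

Because ties at the median go to the high group, the two groups need not be of equal size.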

If the first hypothesis is valid, there will be 
a positive relationship between FA score, de- 
rived from the TAT, and aggressive behavior 
scores on the weekly rating scales and the 
daily behavior reports. Table 1 shows the 
number of individuals with high and low FA 
scores who received high and low DBR ratios. 
The probability of obtaining this set of cell 
frequencies or all other possible sets which 


TABLE 1
DISTRIBUTION OF AGGRESSIVE BEHAVIOR SCORES
AMONG SUBJECTS HIGH AND LOW IN FANTASY

would be even more extreme (i.e., more
favorable to our hypothesis) was calculated 
directly by the method suggested by Fisher 
(3, p. 101). The value obtained was p = .046. 

The distribution of high and low FA scores 
was exactly the same in the case of the high- 
low dichotomy based on weekly behavior re- 
ports as it was in the case of the daily behavior 
reports; hence the same probability value was 
derived in testing the relationship between FA 
and the second measure of behavioral aggres- 
sion. 

As a further test of Hypothesis 1, the DBR 
ratios and WRS scores of the 10 Ss with the 
highest FA scores were compared with those 
of the 12 Ss with the lowest FA scores. Eight 
of the 10 highest FA scorers also received high 
(median or above) WRS scores and high DBR 
ratios, while 9 of the 12 lowest FA scorers also 
received low (below median) scores on the 
two behavioral measures of aggression. The
direct calculation of the probability of obtaining
this set of cell frequencies, or all other
possible more extreme sets, yielded a p of .015. 
This finding also indicates a strong positive 
relationship between covert and overt aggres- 
sion. 
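
The direct probability for this comparison can be sketched as follows. The sketch assumes the 2 x 2 table implied by the counts above (8 high and 2 low overt-aggression scores among the 10 highest FA scorers; 3 high and 9 low among the 12 lowest) and sums the probabilities of the observed table and of all more extreme tables with the same marginal totals. The function name is illustrative.

```python
from math import comb


def fisher_one_tailed(a: int, b: int, c: int, d: int) -> float:
    """One-tailed exact probability for the 2 x 2 table [[a, b], [c, d]].

    Sums hypergeometric probabilities for the observed value of cell a and
    every larger value that the fixed marginal totals allow.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    total_tables = comb(n, row1)
    largest_a = min(row1, col1)
    favorable = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, largest_a + 1)
    )
    return favorable / total_tables


# Rows: 10 highest and 12 lowest FA scorers; columns: high and low overt aggression.
p_value = fisher_one_tailed(8, 2, 3, 9)
print(round(p_value, 3))  # 0.015
```

Under these assumptions the sketch reproduces the reported p of .015.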

It may be concluded from these findings 
that the first hypothesis is strongly supported 
and that, among a group of lower-class children, 
those with a greater amount of fantasy aggres- 
sion manifest more overt aggression than those 
with a smaller amount of fantasy aggression. 

Hypothesis 2 was tested by making a specific 
prediction concerning the relationship between 
a measure of TAT punishment press relative 
to TAT aggressive needs (P scores divided by 
FA scores to yield a P/A ratio) and the amount 
of overt aggression expressed as measured in 
this study. It may be predicted from the 
hypothesis that Ss having high P/A ratios 
will have low overt aggression scores; Ss having 
low P/A ratios will have high overt aggression 
scores. 

The number of individuals with high 
(median and above) P/A ratios and low (below 
median) P/A ratios in the high (median and 
above) and low (below median) categories of 
DBR and WRS is shown in Table 2. The dis- 
tributions for the two measures of overt ag- 
gression were identical. 
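The P/A ratio and the median dichotomy described above can be
sketched as follows. This is a minimal illustration only: the
function names and the five sets of scores are hypothetical (not
data from the study), and the sketch assumes every S has an FA
score greater than zero, since the text does not say how a zero FA
score would be handled.

```python
from statistics import median


def pa_ratios(p_scores, fa_scores):
    """Punishment press relative to fantasy aggression: P / FA per subject.

    Assumes every FA score is greater than zero.
    """
    return [p / fa for p, fa in zip(p_scores, fa_scores)]


def median_split(scores):
    """Label each score 'high' (median and above) or 'low' (below median)."""
    cut = median(scores)
    return ["high" if s >= cut else "low" for s in scores]


# Hypothetical scores for five subjects (not taken from the study).
p_scores = [1, 6, 3, 8, 2]
fa_scores = [4, 3, 6, 4, 5]

ratios = pa_ratios(p_scores, fa_scores)   # [0.25, 2.0, 0.5, 2.0, 0.4]
labels = median_split(ratios)
print(labels)  # ['low', 'high', 'high', 'high', 'low']
```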

TABLE 2
DISTRIBUTION OF AGGRESSIVE BEHAVIOR SCORES
AMONG SUBJECTS WITH HIGH AND LOW
PUNISHMENT-AGGRESSION (P/A) RATIOS

The probability of obtaining this set of cell
frequencies or all other possible sets more
favorable to the hypothesis was p = .165.
Although there is not a statistically significant 
difference between the proportion of Ss in 
the high and low P/A groups having high or 
low overt aggression scores, the difference is in 
the predicted direction and may therefore be 
considered a trend which is mildly supportive 
of the hypothesis. 

Of the 10 Ss scoring highest in P/A ratio,
seven received low DBR and WRS ratings;
of the 10 Ss having the lowest P/A
ratios, only four received low overt aggression
scores and six received high scores. The
probability of obtaining this set of cell fre- 
quencies (or all other possible sets more 
favorable to the hypothesis) was calculated 
directly and the value yielded was p = .153. 
Again the finding may be regarded as favorable 
to the hypothesis since the direction of dif- 
ferences between the high and low P/A groups 
is as predicted. 

TABLE 4
DISTRIBUTION OF WEEKLY RATING SCALE (WRS)
SCORES AMONG SUBJECTS HIGH IN FANTASY
AGGRESSION (FA) BUT LOW IN
PUNISHMENT-AGGRESSION (P/A) RATIOS AND SUBJECTS
LOW IN FANTASY AGGRESSION (FA) BUT HIGH IN
PUNISHMENT-AGGRESSION (P/A) RATIOS

It will be recalled that the third hypothesis
was a synthesis of the first two. Translated
into the measures used in this investigation, it
states that individuals who have high FA
scores together with low P/A ratios will have
high scores on DBR and WRS, while individuals
who have low FA scores and high P/A
ratios will have low overt aggression scores.
Among the 29 Ss, there were seven who had 
high (median and above) aggression needs 
together with low (below median) P/A scores 
and nine who had both low FA scores and high 
P/A scores. The number of individuals in 
each of these two groups in the high and low 
DBR categories is shown in Table 3, while the 
corresponding distribution in the high and low 
categories of WRS scores is shown in Table 4. 
Direct calculation of the probabilities of ob- 
taining these sets of cell frequencies yielded 
the highly significant values of p = .003 in 
the case of Table 3, involving the DBR, and 
p = .020 for Table 4, involving WRS scores. 
These findings are clearly supportive of Hy- 
pothesis 3. Among those who have high fantasy 
aggression but relatively little fear of punish- 
ment, there tends to be a great deal of overt 
aggressive expression; among those who have 
few fantasy aggressive needs and relatively 
great fear of punishment, there tends to be 
little overt aggression. 


DISCUSSION 


The results of this study support the hy- 
pothesis that those lower-class Ss who have a 
relatively high number of aggressive needs on 
the TAT show more overt aggressive behavior 
than Ss having a relatively low number of 
fantasy aggressive needs. It was also found 
that Ss who scored high on punishment press 
relative to aggressive needs in their TAT 
stories tended to express less overt aggression 
than Ss who scored low on punishment press 
relative to their aggressive needs. However, 
this second relationship was less marked than 
that between TAT aggression and overt ag- 
gression. When a high number of aggressive 
needs on the TAT is accompanied by a low
ratio of anticipation of punishment to ag- 
gressive needs, there is a very strong likelihood 
that there will be a relatively great amount of 
overt aggressive expression; where a low num- 
ber of aggressive needs on the TAT is found 
with a high ratio of anticipation of punish- 
ment to aggressive needs, there is little 
probability that a great amount of aggression 
will be expressed overtly. 

These findings help delineate the conditions 
under which fantasy aggression may be used 
to predict overt aggressive expression, and 
consequently may be very useful in the in- 
terpretation of projective materials. For lower- 
class boys, at least, the amount of aggressive 
need shown by an individual in his TAT 
stories is some indication of the amount of ag- 
gression he will show in behavior. However, 
before we make predictions of overt behavior 
on the basis of the amount of aggressive need 
appearing in TAT stories, some attention 
should also be paid to the amount of punish- 
ment press relative to the aggressive need 
present. These conclusions should not be over- 
generalized since, as has already been pointed 
out, earlier studies have shown that in other 
social classes and age groups the relationship 
between covert and overt aggression may be 
negligible or even negative. These findings 
may not be at all applicable to members of a 
social class which has rather rigid taboos 
against the expression of aggression, or to 
older people whose internal controls of ag- 
gressive expression may be more firmly estab- 
lished. 

It is probably impossible to make any over- 
all statements about the validity of a projec- 
tive test such as the TAT, which may be used 
to assess many different aspects of personality. 
As Tomkins (9) has pointed out, attention 
must be focused on whether or not specific 
inferences based on TAT protocols are valid. 
If effectiveness in predicting behavior in social 
situations is accepted as a criterion of validity, 
the findings of this study may be regarded as 
evidence for the validity of TAT inferences 
concerning broadly defined aggressive needs. 
Further research is necessary to elucidate the 
conditions under which other kinds of in- 
ferences from the TAT are valid. 

Some of these data may be interpreted as
supporting that aspect of the frustration-aggression
hypothesis which states that
inhibition of the expression of overt aggres- 
sion will vary with the degree of fear of punish- 
ment. The second and third hypotheses of this 
investigation both concern the relationship of 
fear of punishment to the expression of ag- 
gression. The partial support for the second 
hypothesis, together with the strong support 
for the third, may be taken as evidence in 
favor of the validity of the hypothesis that 
the expression of overt aggression is inversely 
related to the amount of punishment expected. 


SUMMARY 


The interrelationships among aggressive 
needs, anticipation of punishment, and overt 
aggressive behavior in 29 lower-class boys at 
the Bureau of Juvenile Research in Columbus, 
Ohio, were investigated in the present study. 
Analyses of Ss’ TAT protocols yielded meas- 
ures of strength of aggressive needs and fear 
of punishment, while ratings and behavior 
reports submitted by attendants provided 
indices of the amount of overt expression of 
aggression. 

There was strong support for the first 
hypothesis of the study, which stated that 
among lower-class boys, those having a rela- 
tively great amount of fantasy aggressive 
needs indulge in more overt aggressive be- 
havior than those who have relatively few 
fantasy aggressive needs. 

As had been predicted from the second 
hypothesis, Ss whose TAT stories included a 
great deal of punishment press (i.e., fear of 
punishment) relative to the number of their 
aggressive needs demonstrated less overt ag- 
gression than Ss whose ratios of punishment 
press to aggressive needs were low. The rela- 
tionship between the punishment press—ag- 
gressive needs (P/A) ratio and the behavioral 
aggression was not marked, but the findings 
may be regarded as mildly supportive of the 
hypothesis. 

The third hypothesis, a synthesis of the 
first two, was strongly supported by the data. 
The hypothesis stated that Ss who have a 
great deal of fantasy aggression accompanied 
by a small degree of fear of punishment rela- 
tive to their fantasy needs (low P/A ratio) 
show more aggression in their behavior than 
those who have a small amount of fantasy 
aggression accompanied by a high degree of 


FIXATION AND INHIBITION


ALBERT EGLASH 
Mayor's Rehabilitation Committee, Detroit 


FAILING to leap to an open window containing
food, a fixated animal on the Lashley apparatus
will jump at a locked window instead
(11, p. 43). Two aspects of this
behavior may be distinguished: (a) The rat 
always jumps toward the same side of the 
apparatus. This aspect of the behavior has 
been described as the fixating of a response. 
(b) The rat never jumps toward the alternative
side of the apparatus. This aspect of the be- 
havior will be described here as the inhibiting 
of a response. 

Either aspect may be considered as causing 
the other. Depending upon which is seen as 
cause and which as effect, the problem may 
be described either in terms of fixation or in 
terms of inhibition. 


THE PROBLEM AS A FIXATION 


One view of the phenomenon emphasizes 
the animal’s customary response of repeatedly 
jumping to one side of the apparatus. This 
view, describing the behavior in terms of 
fixation, implies that a consistent response 
occurs. It places no limit upon the number 
of responses which fail to occur. In the open- 
window situation, there may be dozens of 
possible responses, none of which appears. 
The required consistency of behavior is in 
terms of the response which occurs. 

This view of the behavior describes the rela- 
tionship between fixation and inhibition by 
suggesting that the ongoing response is so
strong that, persisting, it inhibits the appearance
of any alternative response: a strong
habitual response → an inhibition of alternative
responses.

This view of the behavior is suggested both 
by Maier’s frustration theory (11, pp. 35, 39) 
and by Mowrer’s anxiety theory (14, pp. 355- 
357). 


THE PROBLEM AS AN INHIBITION 


I should like to discuss an alternative formu- 
lation of the problem. This formulation empha- 
sizes the animal’s repeated failure to jump to 
the alternative side of the apparatus. This
view, describing the behavior in terms of
inhibition, implies that some response
consistently fails to appear. It places no limit upon
the number of responses which do appear. In 
the open-window situation, there may be 
dozens of responses, all of which occur. The 
required consistency of behavior is in terms of 
the response which fails to occur. 

This view of the behavior describes the rela- 
tionship between fixation and inhibition by 
suggesting that the inhibiting of the max- 
imally adaptive response forces the animal to 
choose substitutes: inhibition of one response
→ appearance of substitute responses.


INHIBITION OR FIXATION? 


Can either of these formulations be invalidated?
With respect to the variability and consistency
which each implies, these two views
of the phenomenon differ. Where fixation requires
consistency of behavior, an inhibition
view permits variability.

Variability. Whereas fixation suggests rigid- 
ity or perseveration, the fixated animal remains 
as flexible as the normal. This flexibility, some- 
times taking the form of a surprising ingenuity, 
has been described by Hilgard (8, pp. 304-305), 
Maier (10, p. 20; 11, pp. 28, 42-43), and Wil- 
coxon (17) in their discussions of “abortive” 
behavior. 

In addition to displaying a variety of re- 
sponses, as described by Maier, the fixated 
animal may dive directly into the net, or leap 
so lightly that it fails to reach the window. It 
may assume a passive role: the air blast de- 
signed to force the jump blows the animal off 
the stand. 

The rat may jump at 45 or 90 or almost 180 
degrees to the habitual window. Feldman, 
Ellen, and Barrett’s film (5) shows an animal 
leaping onto the roof of the apparatus. The 
animal may leap at the nosepiece which sep- 
arates the two windows, or at the locked 
window’s ledge, and then crawl through the 
open window. 

While no one animal will use all of the above- 
described abortive responses, any given fixated 
animal may use several of them. Even after
developing a habitual method of responding to 
the negative card, the animal continues to 
vary its behavior according to the demands of 
the situation. It leaps to the habitual window 
whenever the positive card is there, then uses 
one or more avoidance responses whenever the 
negative card appears. 

In other words, while the normal animal 
avoids the negative card by leaping to the 
alternative window, the fixated animal ac- 
complishes the same end by jumping abor- 
tively. This adaptability suggests substitute, 
rather than “fixated,” behavior.

Consistency. Although in responding to the 
negative card, and in differentially responding 
to positive and negative cards, the fixated 
animal varies its behavior, it consistently fails 
to make the one response which is maximally 
adaptive: It does not leap to the open window. 
This suggests an inhibition of that response. 


OTHER INTERPRETATIONS OF THE BEHAVIOR 


While failure to display a maximally adap- 
tive response may indicate an inhibition of that 
response, alternative explanations are possible. 
Behavior may be considered a function of 
ability and motivation; unless the animal has 
learned to jump to the open window, and is 
motivated to do so, postulating an inhibition 
is superfluous. 


Ability 

Has the animal learned that, in this situa- 
tion, jumping to the alternative window is an 
appropriate response? All animals jump to 
that window during the initial training period; 
during the first few days of the insoluble 
problem, most animals try the window several 
times; in the soluble problem and in the open- 
window situation, the nonfixated animals re- 
turn to that response. Apparently it lies within 
the animals’ repertory. There is evidence, in 
other words, that the animals have acquired 
the ability to jump to the alternative window. 


Motivation 
Here we shall seek evidence that the animal 
is motivated to go to the open window, and is 
not motivated to avoid that window. 
Preference. In place of using an abortive 
response, does the fixated animal prefer to 
jump to the open window? 
a. One source of data is the animal’s prefixation
behavior. During the initial training
period, an animal can leap either to an open 
window or into the net. Invariably, it jumps 
through a window to the feeding platform. 

b. When the negative card is in the habitual 
window, the window is locked, and the animal 
consequently falls into the net. When the posi- 
tive card is there, the window is unlocked, and 
the animal lands on the feeding platform. 
Resistance to jumping increases whenever the 
fixated animal is confronted with the negative 
card (11, p. 41). 

c. Finally, a direct test of preference may 
be made. When offered a choice between the 
open window on the alternative side and the 
net, the animal dives into the net. It is possible 
to offer the same choice on its habitual side. 
This is done by placing the negative card in 
the alternate window, leaving the habitual 
window open. Rather than dive into the net, 
the animal now jumps through the open win- 
dow. 

Avoidance. Even though the animal prefers 
to land on the feeding platform rather than in 
the net, an accurate description of the be- 
havior might still be given in terms of the 
consistent avoiding of the alternate window. 
Rather than inhibiting a response, the animal 
may be avoiding a location. 

Feldman (5; 11, p. 48) has provided a test 
of this hypothesis. After demonstrating that 
the animal would not leap to the open window, 
Feldman built a bridge or runway from the 
jumping platform to that window. The animal, 
in place of jumping, could now get to the feed- 
ing platform by walking. 

Offered this bridge, the fixated animal 
walked to the window. When the bridge was 
removed, the animal returned to its abortive 
jumping. 

Two aspects of these results seem relevant 
to the hypothesis that the behavior is an 
avoidant response: 

a. An animal which, up to that time, had 
failed to jump to that window now walked 
there, indicating that it was not motivated to 
avoid the window. 

b. The animal’s relearning that food was
obtainable in that window had no effect upon 
its behavior. Mowrer (14, p. 513) has shown 
that an avoidance response, when no longer 
appropriate, ends. The relevant variable in 
fixation is evidently a response rather than a
location (11, p. 48). 

In short, there is evidence that the animal 
is both motivated and able to jump to the 
alternate window. The one consistent aspect of 
its behavior is its failure to do so. 


INHIBITION AS AN ANSWER 


The hypothesis that we are dealing with the 
inhibiting of a response seems consistent with 
observations that have been made both of 
human and of animal behavior. 

1. Even after showing differential resistance
to positive and negative cards, the normal as
well as the fixating animal persists in leaping
toward its habitual side, and each may use
abortive responses (17, pp. 329-331); the
normal animal, however, soon drops this response
and leaps to the alternate window. Consequently,
only in what it fails to do does the
fixated animal differ from the normal.

Maier (11, pp. 147-148) has observed that 
some neurotic animals differ from normals only 
in the inhibiting of certain functions, while 
Freud (6, p. 11) has made the same observa- 
tion about human neurotics. 

2. The animal’s differential resistance to
positive and negative cards, and its use of
abortive avoidance responses, suggest that
with the same goals as the normal animal (to 
avoid punishment and gain rewards), the 
fixated animal takes substitute, ineffective 
paths (6, p. 17). If “behavior without a goal” 
(11, p. i; 16, pp. 25-28) means the behavior of 
an organism without a goal, this designation of 
fixated behavior is a misnomer. 

Like a wrong turn in a maze, or a miss in a 
dart game, fixated behavior might be described 
as misdirected or deflected. The behavior also 
suggests an analogue of reaction formation (6; 
9, pp. 193-194; 14, p. 386; 15, pp. 460-461): 
motivated to jump toward the food in the 
open window, the fixated animal jumps away 
from it. 

Animal fixation, like human neurosis (2, pp. 
13-14), is a failure of an organism effectively to 
reach its goals; at best, fixations, like neurotic 
symptoms (6, p. 26), are “compromise” solutions
(15, p. 423). If “behavior without a goal”
means that the fixated response fails to get the 
animal to its goal, or is relatively ineffective, 
the description seems accurate. 
3. That a state of frustration or anxiety is
responsible for the animal’s behavior is not 
supported by the experimental findings: 

a. Specificity of the behavior. Were the animal’s
emotional state the cause of the behavior,
this behavior would be general rather than 
specific. The fixated animal, generalizing its 
anxiety even to its home cage (11, p. 151), still 
shows normal adaptability in other problem 
situations (11, pp. 47-48). 

b. Effectiveness of guidance. When first of- 
fered guidance, the fixated animal, like the 
neurotic human (7, p. 253), shows strong re- 
sistance (11, p. 53 and Fig. 7). Were the ani- 
mal’s behavior frustration-instigated or anxiety- 
induced, the blocking of one response and the 
forcing of another, which increase frustration 
and anxiety, would be ineffective in ending the 
behavior. Instead, forcing the animal’s response
ends (11, p. 53) and prevents (12) abnormal
behavior.

This raises the question of why guidance is 
effective. If the primary problem is the fixating 
of a response, then the importance of guidance 
is that it “prevents the expression of the fix- 
ated response” (11, p. 79). If the primary 
problem is the inhibiting of a response, then 
guidance is effective because it “leads the 
animal through an alternate response” (11, p. 
79). In applying guidance to alcoholism and 
kleptomania, Maier emphasizes the need to 
help the individual perform an act he has been 
unable to perform, rather than to prevent him 
from performing an act he feels he must per- 
form: 


If one applied the guidance method to a persistent 
type of behavior such as dipsomania, it would seem that 
the procedure would entail giving the alcoholic the 
experience of declining drinks. The method of attempt- 
ing to keep the alcoholic away from drinks avoids the 
problem... . 

Applied to the illustration of stealing in frustrated 
cases, it seems that the procedure would be a matter of 
taking the child to the store a few times and having 
him make actual purchases (11, pp. 162, 175). 


c. Ineffectiveness of reward. Rewarding ex- 
periences, which reduce frustration and anx- 
iety, should end frustrated, anxious behavior, 
as in Farber’s experiment (4). Instead, allowing 
the animal to rest in its cage for months (11, 
p. 44), to solve other problems successfully 
(11, p. 47), to walk from jumping stand to 
feeding platform (5; 11, p. 48), or to jump with 
the habitual window unlocked,¹ does not end
fixation. Maier (p. 175) notes that, at the
human level, fixated behavior may persist 
after the frustration is alleviated. 

The view that the behavior is a function of 
anxiety or frustration is not consistent with its 
specificity, with the effectiveness of guidance, 
or with the ineffectiveness of reward. An al- 
ternative view is that the animal’s frustration 
is a function of its inhibition. Without this
inhibition, the animal in the open-window
situation would experience no frustration. A
frustrating situation is one in which a barrier 
prevents an organism from reaching its goal. 
In the open-window situation, where there is 
no external barrier, the barrier must be some 
aspect of the organism itself. Maier (11, p. 151) 
makes a similar suggestion in connection with 
human neurotics. The organism may become 
frustrated as a result of a conflict between its 
“excitatory tendency and an opposing inhibitory
tendency” (1, p. 481).² Rather than frustration
→ fixation, the relationship may be inhibition
→ frustration.

Similarly, in the open-window situation 
there is no external basis for fear. However, 
when the fixated rat inhibits a learned response, 
it apparently regresses to an initial or unlearned 
reaction (8, pp. 83, 94; 11, p. 40): it resembles, 
in its behavior, an untrained animal (11, pp. 
144-145); in its anxiety, a naive animal (9; 11, 
p. 131). Rather than anxiety → fixation, the
relationship may be inhibition → frustration
→ regression → anxiety (6, p. 40).


1 An unpublished study by the writer, carried out
at the University of Michigan with the cooperation of 
Drs. N. R. F. Maier and Paul Ellen. It was this experi- 
ment which first indicated that some modification must 
be made in the present formulation of frustration the- 
ory. 

2 The fixated animal acts as if it had internalized a 
barrier to its goal. May (13, pp. 337-344) has made a 
similar observation about human neurotics. He found 
that neurotic anxiety occurred in those who, denying 
the objective reality of a parental rejection, internalized 
the parent-child conflict; this became the internal con- 
flict giving rise to anxiety. 

Maier (11, pp. 132-133) describes an internal con- 
flict between a frustration-instigated compulsive im- 
pulse and a fear-motivated inhibition of the impulse. 
One day a youngster at the beach said that he would 
like to swim under water, but was afraid. Here was the 
opposite kind of conflict from that described by Maier: 
the impulse was felt as motivated, the inhibition as 
compelling. It was this incident which first suggested 
the possibility of reinterpreting Maier’s data in terms 
of inhibition. 




INHIBITION AS A PROBLEM 


The substitution of “inhibition” for “fixa- 
tion” does not explain the behavior, but only 
describes it. Rather than solving the problem 
of behavior which “is at one and the same time 
self-defeating and yet self-perpetuating” (14, p. 
351), this inhibition constitutes the problem. 

While a cognitive theory can explain much 
of Maier’s data, it leaves this central problem, 
the failure of the animal to leap to the open 
window, unsolved (3). Reinforcement (anxiety- 
reduction) theory will also account for much 
of the data, but Mowrer points out that this 
too leaves the essential problem unsolved: 


The only question which Maier’s results leave un- 
answered is why Maier’s subjects show less behavior 
variation when fear-dominated than when hunger- 
dominated.... Several investigations... have indi- 
cated that living organisms behave less freely, less 
flexibly when being subjected to punishment than when 
operating solely under the influence of reward (14, p. 
357). 


In one sense it is not true that the fixated 
animal behaves less flexibly. It has a repertoire 
of responses larger than that shown by the 
normal animal. In a more important sense, 
however, the animal is indeed less free, for the 
maximally adaptive response is not displayed. 

This is the same dilemma which, at the 
human level, Freud described as unsolved (6, 
pp. 89-92). To explain this inhibition is to 
solve the problem of animal fixation, and the 
explanation may, by analogy, help solve the 
riddle of human neurosis. 


SUMMARY 


To the current view that animal fixation 
occurs when a persistent response inhibits the 
expression of more adaptive behavior, an al- 
ternative view is offered: An underlying inhibi- 
tion leads to substitute behavior. This view 
seems consistent both with Maier’s experi- 
mental findings with animals and with his 
application of the derived principles to human 
guidance. 


ETHNOCENTRISM AND MISANTHROPY¹

IT HAS been clear for some time that
persons who are prejudiced against a
particular ethnic minority do not ordi- 
narily limit their antipathies to the group in 
question. Those who express an animus against, 
say, Negroes will tend also to express hostile 
attitudes toward a diversity of other groups, 
including some which are fictitious (8). The 
ethnocentric person, despite his willingness to 
adduce specific reasons for rejecting designated 
groups, nevertheless appears to be oriented 
toward a general rejection of others. 

Recent researches into prejudice, particu- 
larly those of the University of California 
Public Opinion Study (hereafter, UC-POS) 
(1), have given further documentation to this 
observation. Perhaps the most important 
discovery stemming from these investigations, 
however, has been the finding that ethnic 
prejudice is somehow rooted in, or at the 
very least associated with, personality or- 
ganization. The persons designated as preju- 
diced are found to have undergone particular 
processes of socialization (1, 6), are shown to 
utilize peculiar vocabularies of motives (1), 
are seen to differ from the unprejudiced with 
respect to such ostensibly peripheral functions 
as memory (4), cognition (13), social percep- 
tion (14), and so on. This insight into the 
nature of prejudice—that it is functional to 
personality—enables us to understand the 
ubiquitous antipathies of the ethnocentric. 

Studies linking authoritarianism to ethno- 
centrism have yielded still other suggestions: 
the early interpersonal experiences of these 
persons have been such that they are given to 
cynicism regarding the motives of others (5), 
a perception of others as hostile and threatening
(1, 5, 14), a general suspiciousness (1, 7);
in short, they are marked by an apparent
misanthropy.

1 The authors are indebted to Drs. H. G. Gough
and M. B. Freedman for their aid in the preparation
of this paper.

Turning now to the measuring instruments 
used in the investigations of ethnic attitudes— 
usually, direct questionnaires—it is seen that 
the usual item can be separated into at least 
two elements: (a) a reference to a particular 
ethnic minority and (b) a statement, usually
derogatory although often couched in pseudo- 
democratic terminology, pertaining to some 
presumed attribute of the group in question, 
e.g., sensuality, greed, clannishness, and so 
on. It is assumed, if only implicitly, that there 
is a necessary linkage in the imagery of the 
respondent between the minority group and 
the invidious imputation. If, however, we 
formulate ethnocentrism as misanthropy, the 
possibility arises that assent to statements 
derogatory toward a specific group may 
reflect or express hostility toward people in 
general—or, that the designation of the 
particular outgroup is not as important in 
securing assent as is the imagery of hate and 
exclusion.³ The study to be reported is an
attempt to test this hypothesis. 


METHOD 


A 29-item questionnaire was developed in the follow- 
ing manner: Preliminary Ethnocentrism scale items 
used in the UC-POS were revised so that references to 
specific minorities were omitted; the words “people,” 
“most people,” or “human(s)” were substituted. It 
was found necessary to eliminate from consideration 
a few items because they related too directly to stereo- 
types about particular groups, e.g., the Jews’ alleged 
crucifixion of Christ. As may be seen from the examples 
given below, elimination of these items did not con- 
stitute a serious departure from the plan of retaining 
the derogatory imputations. Aside from the
substitutions, the original wording was retained
except in instances where sensibleness
necessitated slight revisions. Typical of the 29
items⁴ are the following, listed with their
original counterparts:

Original: Jews seem to prefer the most luxurious,
          extravagant, and sensual way of living.
Revised:  People seem to prefer the most luxurious,
          extravagant, and sensual way of living.

Original: Jews tend to lower the general standard of
          living by their willingness to do the most
          menial work and to live under standards
          that are far below average.
Revised:  Most people tend to lower the general
          standard of living by their willingness to
          do the most menial work and to live under
          standards that are far below average.

Original: Jews should be more concerned with their
          personal appearance and not be so dirty
          and smelly and unkempt.
Revised:  People should be more concerned with their
          personal appearance and not be so dirty
          and smelly and unkempt.

Original: The Negroes would solve many of their
          social problems by not being so
          irresponsible, lazy, and ignorant.
Revised:  People would solve many of their social
          problems by not being so irresponsible,
          lazy, and ignorant.

3 The existence of such an underlying misanthropy
is suggested by Gough (6). He points to the existence
of sets and attitudes which obscure the measurement
of sought-after variables, citing the K factor in the
Minnesota Multiphasic Personality Inventory as an
example (10). Gough considers misanthropy to be a
variable underlying and confounding the measurement
of ethnocentrism.


The scale thus devised (hereafter denoted as M, for 
misanthropy) was administered to 221 students of 
elementary psychology at a midwestern university. 
The subjects (Ss) at the same session completed a 
UC-POS 20-item Ethnocentrism (E) scale (1). It should 
be made clear that the E scale statements did not dup- 
licate in phrasing any of the M scale items; that is, 
the M scale items used in the study were not among 
those included in the final, 20-item E scale but were 
drawn from the larger sample of preliminary items used 
in evolving the E scale. Two of the four classes tested 
received an M-E order of presentation; the order was 
reversed for the other two. It may be noted that the 
mean E and M scale scores for the two sequences of 
administration did not differ significantly from each 
other: the p values for the critical ratios were .23 and 
.62 for E and M, respectively. 

The Ss were permitted six categories of response for 
each item, ranging from complete agreement (+3) to 
complete disagreement (−3), a middle category being 
excluded. The responses were translated into a seven- 
point scoring scale, with maximum disagreement re- 
ceiving one, maximum agreement seven, and omissions 
four points. Low scores, then, indicated relative absence 
of prejudice and misanthropy. 
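
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the mapping of the intermediate categories (adding 4 to each signed response) is an assumption, since the text states only the end points and the value given to omissions.

```python
from typing import Optional, Sequence

VALID_RESPONSES = (-3, -2, -1, 1, 2, 3)  # six categories, no middle (zero) category


def item_score(response: Optional[int]) -> int:
    """Convert one response to the seven-point scoring scale.

    None stands for an omitted item, which receives four points.
    Assumption: intermediate categories are scored as response + 4.
    """
    if response is None:
        return 4
    if response not in VALID_RESPONSES:
        raise ValueError(f"response must be one of {VALID_RESPONSES} or None")
    return response + 4


def scale_score(responses: Sequence[Optional[int]]) -> int:
    """Total score for one subject; low totals indicate low prejudice or misanthropy."""
    return sum(item_score(r) for r in responses)
```

Under these assumptions a subject who omitted every item of the 29-item M scale would receive 29 × 4 = 116, the midpoint of the possible range of 29 to 203.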





4 A more detailed form of this scale has been deposited 
with the ADI. Order Document 4034 from the 
ADI Auxiliary Publications Project, Photoduplication 
Service, Library of Congress, Washington 25, D. C., 
remitting in advance $1.25 for photoprints or $1.25 
for 35 mm. microfilm. Make checks or money orders 
payable to Chief, Photoduplication Service, Library 
of Congress. 
RESULTS 


The test of the major hypothesis is in the 
degree of association between the two scales; 
the product-moment correlation is .43. The 
corrected odd-even reliabilities for E and M 
are .84 and .79, respectively. Correcting for 
attenuation, we obtain an r of .53. Both 
correlation coefficients are significant at the 
.001 level of confidence. 
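
The correction for attenuation reported above can be checked with a short sketch. The function names are illustrative, and the t test shown for the significance of r is the standard one for a product-moment correlation, not necessarily the procedure the author used.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation between true scores from the observed
    correlation and the reliabilities of the two measures."""
    return r_xy / math.sqrt(r_xx * r_yy)


def t_for_r(r: float, n: int) -> float:
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Values taken from the text: r = .43, reliabilities .84 (E) and .79 (M), N = 221.
r_corrected = correct_for_attenuation(0.43, 0.84, 0.79)
t_observed = t_for_r(0.43, 221)
print(round(r_corrected, 2))  # 0.53
```

With 219 degrees of freedom, the resulting t of about 7 lies well beyond the value required at the .001 level, in agreement with the statement in the text.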


DISCUSSION 


The meaning of the results may be illumi- 
nated if we imagine polar opinions concerning 
the nature of prejudice. One extreme would 
insist that prejudice is primarily, if not entirely, 
determined by personality dynamics, that if 
minority groups did not exist inner needs 
would invent them. An opposing view would 
hold that prejudice is unrelated to needs and 
motives, that it is attributable to the specific 
social and historical conditions which place 
certain groups at a disadvantage. To the 
extent that the present data are relevant to 
this dialectic, they suggest a middle position: 
On the one hand, misanthropy—which would 
conceivably be one of the key central variables 
to which proponents of the first view would 
reduce all prejudice—is associated with ethnic 
attitudes, as witness our correlation. On the 
other hand, the correlation is certainly not 
large enough to demonstrate that prejudice is 
isomorphic with an underlying misanthropy 
or that the designation of particular minorities 
as objects of hate is adventitious, i.e., free from 
social press. 

That our major hypothesis should have been 
substantiated by the results is not surprising 
when we remember that previous studies have 
indicated the prevalence of hostile, suspicious, 
cynical feelings in the authoritarian toward 
others. What is worthy of note here is the 
generality of reference of these attitudes. It 
appears, at least on the surface, that for many 
of the antidemocratic there may be no ingroup 
other than the self.⁵ Whereas previous conceptualizations 
of ethnocentrism have stressed 
ingroup-outgroup dichotomization, it may 
ultimately be shown that this dichotomy does 
not exist for many of the high-scoring persons 
and that they are totally exclusionistic. The 
outstanding example of such a person is the 
paranoid schizophrenic, who, it appears from 
clinical observation, is very strongly characterized 
by the tendencies and predispositions 
we have been discussing and who, we know, is 
almost completely estranged from others. 

5 Indeed, antipathies may even be directed toward 
the self or aspects of the self. Erikson (3, p. 215) sees 
the rejection of minorities as a manifestation of intolerance 
of tendencies within the self which arise 
genetically from unresolved nuclear conflicts. Again, 
Rogers (12) and his associates have stressed the salience 
of negative attitudes toward the self concept in maladjustment. 

If we direct our attention again to the com- 
ponents of the usual questionnaire used in 
antiminority researches, the fact that there is 
reason to question the existence of ingroups for 
some prejudiced persons may be more clear. 
What we find, to repeat, are (a) designations 
of specific groups with (b) certain characteristics 
attributed to them: The outgroups 
are concretely specified and defined by the 
items and by the pre-existing stereotypy ap- 
pealed to. For the ostensible ingroup, no such 
concreteness of specification and description is 
provided. Rather, the assumption is made that 
“opposite” groups with “opposite” attitudes 
directed toward them must exist on the basis 
of the information gathered concerning out- 
group attitudes. This is not to say that there 
may not be ingroups with which many ethno- 
centric persons are identified, but that evidence 
derived from direct questionnaires does not 
provide an adequate basis for this conclusion. 

An exception to the practice of limiting 
scale items to the two components noted above 
is offered by the Ethnocentrism scale, which 
was designed with an eye toward representing 
ingroups in the items. For example, some of 
the items concerning Negroes indicate whites 
as the ingroup, Jews are opposed to Christians, 
etc., so that what emerges is an ingroup com- 
prised of “native, white, Christian Ameri- 
cans” (1, p. 109). Adding these ingroup 
references leads to the necessity of making 
assumptions concerning the characteristics of 
the ingroup. These characteristics may not 
typify the ingroups of all the ethnocentrics. 
Rather than “defending” an ingroup with 
which they are identified, these persons may 
only be availing themselves of the opportunity 
to express exclusionistic feelings, toward the 
specified groups or generally. It is suggested, 
therefore, that the introduction of references 
to ingroups into antiminority group questionnaires, 
based as they are on a priori determinations, 
results in the confusion of 
variables and should be avoided. 

Underlying the foregoing is the important 
conceptual question of whether investigations 
into attitudes toward groups are best guided 
by a psychological or by a sociological frame 
of reference. Consider, for example, the grounds 
on which one can make a priori determinations 
of ingroups—and, for that matter, outgroups. 
In the designation, native, white, Christian 
Americans, an essentially sociological defini- 
tion has been resorted to. Consider also 
the UC-POS definition of ingroup and out- 
group: “ ‘Ingroup’ and ‘outgroup’ are socio- 
psychological rather than purely sociological 
concepts, since they refer to identification and, 
so to speak, contra-identification, rather than 
to formal membership in the group” (see 1, 
pp. 146-147). In this and in related text, 
reference is made to a primarily psychological 
point of view. Yet, when the major outgroups 
are enumerated, it is apparent that a socio- 
logical orientation is used. We do not take 
issue with the right or the necessity of doing 
this, at least insofar as outgroups are con- 
cerned, but must cite the fact of a sociological 
view in order to make a further point. It 
might parenthetically be noted that raising 
such considerations should help us clearly to 
define the nature of the investigations under 
discussion: They have been focused on the 
extremely important problem of measuring 
tolerance and intolerance of socially and 
economically deprived and oppressed minori- 
ties. 

We would, however, separate sharply from 
this sort of interest that having to do with 
group identification and contraidentification 
and would contend that investigations into the 
latter phenomena must necessarily utilize 
different methods and different concepts. Iden- 
tification and contraidentification are differ- 
entiated here from ethnic tolerance and 
intolerance, for, although there may un- 
doubtedly be overlap, the two cannot be 
equated. Identification as a concept in custom- 
ary usage denotes a highly idiographic, dy- 
namic process, in which idiosyncratic percepts, 
needs, and feelings are of crucial importance 
and defy externally imposed definitions. Here, 
then, a psychological approach must be used. 

The need for the separation of the tolerance 
and identification dimensions is suggested by 
Hartley’s study (8) in which a completely 
fictitious group was presented for rejection or 
acceptance. That many persons manifested 
intolerance of this nonexistent group reveals 
their prejudiced attitudes toward the strange 
and the alien. It does not, however, demon- 
strate contraidentification, for this term 
denotes, among other things, the association 
with the group in question of a set of attributes, 
learned either through experience or the 
propagation of stereotypes. 

Our arguments, then, lead us to conclude 
that further research in this area is necessary. 
It is apparent that inferences concerning 
ingroup attitudes from data concerned with 
the outgroup are untenable, yet it is also clear 
that our own information does not provide 
sufficient support for a supposition of an utter 
absence of ingroup feeling among the mis- 
anthropic. The topics which need investigating 
are identity and identification. We need in- 
formation concerning both the existential 
functioning of the affiliative and self-regarding 
sentiments and the processes by which these 
dispositions become existent. There is con- 
siderable evidence that writers in every area 
of social science are similarly persuaded. In 
Toward a General Theory of Action (11) such 
diverse theorists as Parsons, Shils, Tolman, and 
several others join in pointing to the salience 
and significance of the identification concept 
in contemporary social science research. And 
it is significant, we feel, that Erikson (3) has 
made the notion of identity so central to 
his conception of ego development and func- 
tioning, and that Burke (2) has used the term 
“identification” as a crucial one in his analysis 
of language. 

M AS A SCALE 


Since the purpose of this study was the 
testing of a hypothesis rather than the 
construction of a reliable instrument for the 
measurement of the misanthropy variable, no 
attempt at further refinement of the M scale 
was made. A normal distribution of scores was, 
however, obtained; the reliability figure of .79 
appears adequate under these circumstances. 
It would seem that the present instrument 
should be readily available for use as a scale 
after the omission of the less differentiating 
items and possibly the rewriting of others. The 
validation would, of course, be a more difficult 
though not an insuperable problem. Of par- 
ticular interest would be the replications of 
researches into various functions, e.g., memory, 
social perception, problem solving, etc., using 
M instead of or along with E in selecting 
experimental Ss. Whether M would provide 
a basis for better differentiation of Ss differing 
in these functions is, of course, a matter for 
empirical determination. The M scale would 
also appear to provide an instrument for 
identifying the nonmisanthropic among ethno- 
centrics, who would presumably be more sus- 
ceptible to re-education toward greater toler- 
ance. 


SUMMARY 


Twenty-nine items from existing scales of 
ethnic prejudice were rewritten so that the 
terms “people” or “most people” or “hu- 
man(s)” were substituted for the specific 
minorities originally designated. The scale 
thus constructed (termed M for misanthropy) 
was found to be correlated .43 (.53 when cor- 
rected for attenuation) with a 20-item version 
of the UC-POS scale for general ethnic in- 
tolerance. The results of the study were dis- 
cussed with reference to the possible connec- 
tions between prejudice and misanthropy. 
The implications of misanthropic imagery for 
attitude scale construction were noted, and 
certain methodological considerations were 
raised. Some suggestions were made pertaining 
to the need for revising our present conceptual- 
ization of ingroup-outgroup identification and 
contraidentification. Finally, it was indicated 
that the present instrument could, after some 
work, be used for subsequent research in this 
area. 
CONFORMING BEHAVIOR OF PSYCHIATRIC AND 
MEDICAL PATIENTS¹ 


JACOB LEVINE, JULIUS LAFFAL, 

VA Hospital, West Haven, Connecticut 
MARTIN BERKOWITZ, JAMES LINDEMANN, 
Pennsylvania State University 
AND JOHN DREVDAHL 
University of Nebraska 


THE symptoms of the neurotic or emotionally 
disturbed individual usually 
find expression in difficulties in his 
interpersonal relationships, particularly the 
more intimate ones. We observe that the emo- 
tional difficulties of the neurotic are manifest 
also in some ways in his group behavior, either 
by failure to integrate into a group or by in- 
ability to conform to group standards. It is 
probably safe to postulate that the more 
severely disturbed an individual is in his 
interpersonal relationships, the more likely it is 
that that disturbance will be reflected in his 
group behavior. We may expect at least some 
disturbance in the neurotic’s integration into 
a group or in his ability to conform to group 
demands. But we have few systematic studies 
that would provide substance to this type of 
expectation. 

The present study was designed to test the 
assumption that the neurotically disturbed 
person will show some difficulty in his group 
behavior. On the basis of this assumption we 
may predict that one expression of this dif- 
ficulty will be the relative inability to adjust 
to or be influenced by group standards. The 
disturbed individual, hampered by anxiety and 
emotional conflict, will not readily shift his 
behavior to accord with that of the group of 
which he is nominally a part. 

In order to measure the influence of group 
norms upon the neurotic’s judgments, we have 
made use of the autokinetic phenomenon. In 
his study of group influences upon the median 
of autokinetic judgments, Sherif (5) found 
that the successive judgments of individuals 
tended to converge toward a common group 
norm when they were tested in groups. He 
interpreted this finding as indicating a tendency 
of individuals as members of a group to 
conform to group standards. He further found 
that the convergence of the medians was not as 
close when subjects started with an individual 
session and were then brought into a group as 
when they began in the group situation. 
Finally, he observed that when a group member 
was subsequently tested alone he adhered 
closely to the group norm and range of judgments. 
Jenness (3) also found this type of 
convergence resulting from group influence in 
the estimation of the number of beans in a jar. 

1 This study was conducted at the VA Hospital, 
Newington, Connecticut, and is submitted with the 
approval of the Chief Medical Director, Veterans Administration, 
Washington, D. C. The opinions expressed 
here are those of the authors and not of the 
Veterans Administration. 

Bovard (1) later found that group influence 
on convergence toward a group norm was 
greater in a group-centered than in a leader- 
centered group. He interpreted these findings 
to mean that there was much greater inter- 
personal interaction in the group-centered than 
in the leader-centered group. In another study, 
Levine and Butler (4) confirmed, with judg- 
ments of performance ratings, Lewin’s earlier 
finding that a group-centered group was more 
effective in modifying individual behavior to- 
ward more socially acceptable norms than was 
a leader-centered group. These studies all sug- 
gest that the degree of convergence of the 
judgments of individuals toward a group norm 
is largely determined by the amount of inter- 
personal interaction. 

On the basis of these findings, the present 
study was designed to test the hypothesis that 
the emotionally disturbed individual will inter- 
act less in the group situation than will the 
individual who is not so incapacitated. Spe- 
cifically, a comparison will be made with re- 
spect to the degree of convergence toward a 
group norm of autokinetic judgments between 
subjects (Ss) who were hospitalized because 
of emotional difficulties and those who were 
hospitalized for minor medical illnesses. If our 
assumption is correct, the autokinetic judg- 
ments of the neurotics will not converge in the 
group situation to the same degree as will 
those of the nonneurotics. 


METHOD 


The method of inducing the autokinetic effect in 
the present study was essentially similar to that used 
by Sherif (5). The Ss, either singly or in groups of three 
to six, depending on the experimental session, were 
brought into a completely darkened room. They were 
told that they would see a pin point of light which would 
move, and that they were to estimate the greatest 
distance it moved in any one direction from the start- 
ing point. The instructions were: 

We want to see how well you can judge the 
movement of a point of light in complete dark- 
ness. In a moment the light will go out and 
when I say “Ready,” you will see a point of light 
before you. Watch this point. It will begin to 
move. A few seconds later it will disappear. 
When it disappears 

(For individual sessions): I want you to call 
out immediately how far it moved. 

(For group sessions): Dr. ____ will call out 
your names individually and when he does I 
want you to call out how far you saw the light 
move. 

The point may move in several directions. 

I want you to give the furthest distance it 
moves in any one direction. Give your judg- 
ment in inches. 

After you have given your judgment the 
light will appear again when I say “Ready.” A 
few seconds later it will disappear. When it 
disappears... (appropriate instructions for 
individual and group)....We will do this 
several times. 

There were ten trials within each session, and there 
were four separate sessions. In each trial the pinpoint 
of light was exposed for 30 seconds. One E remained 
in the experimental lightproof room, and two Es were 
outside the room in an alcove separated from the ex- 
perimental room by opaque blanket drops. One of the 
outside Es operated the pin-point and room light 
switches in response to signals called by E in the room, 
and one called out the names of Ss in the room when 
the time came for reporting judged distances, and re- 
corded the judgments which were given aloud by Ss. 

The Ss were all hospitalized patients at a VA general 
hospital and were divided into two major groups. One 
group of ten patients came from the open neuropsychiatric 
service. Those with psychotic symptoms were 
excluded from this group. The other group consisted 
of seven patients from the medical wards of the hospital 
with minor organic complaints but with no evident 
disabling emotional difficulties. Cases with the common 
psychosomatic illnesses were ruled out of this latter 
group. For purposes of differentiation between the two 
groups, we shall refer to the first as the “neurotic” and 
the second as the “nonneurotic” group. 
The Ss in each of these two major groups were divided 
in random fashion into two subgroups which then 
differed according to the sequence of individual (I) 
and group (G) sessions. Thus one neurotic subgroup and 
one nonneurotic subgroup started the experiment with 
an individual session in which each member of the 
group observed the pin-point light alone for ten trials, 
followed by three successive group sessions, spaced one- 
half day apart, in which the subgroup as a whole 
observed the pin-point light. The other two subgroups 
started the experiment with three group sessions spaced 
at half-day intervals, followed by an individual 
session at a one-half-day interval. Four groups were 
thus obtained, identified henceforth as group A, the 
neurotic IGGG group; group B, the neurotic GGGI 
group; group C, the nonneurotic GGGI group; and 
group D, the nonneurotic IGGG group, according to 
the sequence of individual and group sessions. 

The fundamental datum in this study was the 
amount of variability shown within each trial by the 
groups of Ss. For each of the ten trials per session, 
variability was calculated by the formula $\sum x^2/(n - 1)$ 
(x being the deviation of a judgment from the trial mean), 
which will be recognized as an estimate of variance. 
These variability scores, of which there were ten in 
each session, were then used as the basic scores in an 
analysis of variance to determine if the groups differed 
from each other in mean variability from session to 
session. A table similar to Table 1 was constructed 
showing the means of the variabilities for the groups 
and combinations of groups, and the standard devia- 
tions of these means, for each session. In this initial 
table it was evident that the means were related to 
their standard deviations, and that the data conse- 
quently failed to meet the requirements of normal 
distribution and homogeneous variance in analysis of 
variance. A logarithmic transformation of the vari- 
ability scores (2, p. 203) was effected by adding 1 to 
each score and converting to its logarithm. Table 1 
represents the means and standard deviations of the 
log-transformed variability scores. The effect of the 
transformation was to reduce the rank-order correla- 
tion between means and standard deviations from .84 
to .36, the latter being statistically nonsignificant. As 
a result of the log transformation it was possible to 
assume that the variances of the variability scores were 
homogeneous and normal among all groups, thereby 
meeting the requirements of the analysis of variance, 
which was then carried out. 
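
The treatment of the variability scores described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the base of the logarithm is assumed to be 10 (the text does not give it), and the rank-order correlation is computed as a product-moment correlation on ranks, with tied values given average ranks.

```python
import math
from statistics import mean, variance


def trial_variability(judgments):
    """Variance estimate for one trial: squared deviations summed over n - 1."""
    return variance(judgments)


def log_transform(score):
    """Add 1 to a variability score and take its logarithm (base 10 assumed)."""
    return math.log10(score + 1)


def ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def rank_correlation(xs, ys):
    """Rank-order correlation, e.g., between group means and standard deviations."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical judgments (in inches) from three Ss on a single trial.
example = trial_variability([2.0, 4.0, 6.0])  # 4.0
transformed = log_transform(example)          # log10(5)
```

Adding 1 before taking the logarithm keeps a trial with zero variability defined, since log(0 + 1) = 0.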


RESULTS 


Table 1 gives the logarithmically trans- 
formed mean variabilities of judgments of the 
autokinetic effect for each group and combina- 
tion of groups in each session. Group A, the 
neurotic IGGG group, had the greatest over-all 
variability. As a matter of fact, the variability 
of this group increased from session to session 
despite the shift to group sessions. Group D, 
the nonneurotic IGGG group, was next in 
over-all variability. But for this group there 
appeared to be a general decrease in variability 
in successive group sessions, a trend in direction 
opposite to that of group A. The other 
neurotic group, group B (GGGI), ranked third 
in over-all variability, but there did not appear 
to be any consistent trend from session to session. 
The nonneurotic GGGI group, group C, 
ranked lowest in over-all variability. The greatest 
variability in this group appeared in the 
individual session after the three group sessions, 
which otherwise showed no consistent trend. 

TABLE 1 
MEAN VARIABILITY OF JUDGMENTS OF THE AUTOKINETIC EFFECT 
(Logarithmically transformed scores) 

                                                        SESSIONS 
GROUP*     TYPE OF S                 SEQUENCE           1            2            3            4            ALL SESSIONS 
A (6 Ss)   Neurotic                  IGGG           M   .67          .86          1.05         1.17         .94 
                                                    SD  .31          .33          .32          .16          1.65 
B (4 Ss)   Neurotic                  GGGI           M   .79          .63          .72          .82          .74 
                                                    SD  .20          .25          .34          .24          1.29 
C (4 Ss)   Nonneurotic               GGGI           M   .37          .51          .22          .72          .45 
                                                    SD  [illegible]  .27          .15          .22          .81 
D (3 Ss)   Nonneurotic               IGGG           M   .95          .96          .86          .85          .90 
                                                    SD  .34          .39          .40          .39          1.58 
A and D    Neurotic and nonneurotic  IGGG           M   .81          .91          .95          1.01         .92 
                                                    SD  .36          .36          .38          .34          1.62 
B and C    Neurotic and nonneurotic  GGGI           M   .58          [illegible]  .47          [illegible]  .60 
                                                    SD  .31          [illegible]  .36          .23          1.09 
A and B    Neurotic                  IGGG and GGGI  M   .73          .75          .88          .99          .84 
                                                    SD  .27          .31          1.16         .26          1.48 
D and C    Nonneurotic               IGGG and GGGI  M   .66          [illegible]  .54          .79          .68 
                                                    SD  .42          .40          .44          .32          1.28 
All groups                                          M   .70          .74          .71          .89          .76 

* Means of groups A, B, C, and D are based on 10 variability scores within each session. Means of the combinations of groups are 
based on 20 scores, and means of all groups are based on 40 scores. 


TABLE 2 
TRENDS IN VARIABILITY OF JUDGMENTS OF THE 
AUTOKINETIC EFFECT FROM SESSION TO SESSION, 
BASED ON ANALYSIS OF VARIANCE 

GROUP      TYPE OF S                 SEQUENCE OF SESSIONS  DIRECTION OF TREND  p 
A          Neurotic                  IGGG                  Increase            .01 
B          Neurotic                  GGGI                  None                n.s. 
C          Nonneurotic               GGGI                  Mixed               .01 
D          Nonneurotic               IGGG                  Decrease            .01 
A and D    Neurotic and nonneurotic  IGGG                  None                n.s. 
B and C    Neurotic and nonneurotic  GGGI                  Mixed               .05 
A and B    Neurotic                  IGGG and GGGI         Increase            .05 
D and C    Nonneurotic               IGGG and GGGI         None                n.s. 

Table 2 shows the significance of the vari- 
ability trends for each of the individual groups 
and combinations of groups, as compared to a 
horizontal line, based on analysis of variance. 
Groups A, C, and D showed significant dif- 
ferences in variability from session to session, 
whereas group B showed no such difference. 
Inspection of the mean values for group A 
showed that the significant trend lay in the 
direction of a general increase in variability 
from session to session. For Group D the trend 
was in the opposite direction, toward decrease 
of variability from session to session. In the 
case of group C the trend was not uniform, 
being significant probably because of the large 
difference between the final group session, 
which was low in variability, and the last 
(individual) session, which had relatively high 
variability. 

In Table 3 the various groups and combina- 
tions of groups are compared with respect to 
their mean variabilities and the significance 
of the differences between them. A comparison 
between groups A and D showed no significant 
difference in over-all mean variability, but 
there was a significant difference in the inter- 
action between groups and sessions. As Table 
1 indicates, for group A the variability from 
session to session increased, whereas for group 
D it decreased. In a comparison of groups B 
and C both interaction and over-all mean 
variability were found to be significant. The 
two neurotic groups combined were also compared 
to the two nonneurotic groups combined. 
There was no significant interaction 
between groups and sessions, but the over-all 
mean variabilities were significantly different, 
the neurotic groups being significantly more 
variable in their judgments than the nonneurotic. 
Finally, the combined trend of the 
IGGG neurotic and nonneurotic groups was 
compared with the combined trend of the 
GGGI neurotic and nonneurotic groups (AD 
vs. BC). There was no significant interaction 
between groups and sessions but the over-all 
group variabilities were significantly different, 
with the IGGG groups being far more variable 
than the GGGI groups. 

TABLE 3 
THE SIGNIFICANCE OF THE DIFFERENCES IN VARIABILITY 
OF JUDGMENTS OF THE AUTOKINETIC EFFECT, 
BASED ON ANALYSIS OF VARIANCE 

GROUPS COMPARED       TYPE OF S                  SEQUENCE OF    INTERACTION BETWEEN   OVER-ALL MEAN DIFFERENCES 
                                                 SESSIONS       GROUPS AND SESSIONS   BETWEEN GROUPS 
A vs. D               Neurotic vs. nonneurotic   IGGG           .05                   n.s. 
B vs. C               Neurotic vs. nonneurotic   GGGI           .05                   .01 
A and D vs. B and C   Mixed                      IGGG vs. GGGI  n.s.                  .01 
A and B vs. D and C   Neurotic vs. nonneurotic   Mixed          n.s.                  .01 (?) 


DISCUSSION 


On the basis of the present findings we may 
conclude that the neurotic group of patients 
were more variable in their perceptual judg- 
ments than the nonneurotic. This variability 
was less affected by group influences in the 
neurotics than in the nonneurotics. As a matter 
of fact, the tendency for the judgments of the 
individuals to converge toward a group norm 
as observed by Sherif (5) was not found at all 
in the neurotic group. It was found only in the 
case of the nonneurotics who began with an 
individual session and then were exposed to 
three successive group sessions. 

These results confirm our original hypothesis 
that individuals beset by neurotic difficulties 
will be less influenced by group forces than 
those who are not. On an empirical basis, one 
might perhaps have predicted that a bimodal 
type of distribution of reactions by the neurot- 
ics would have taken place. The reasoning 
would be that the neurotic who is withdrawn 
and isolated from the group and whose contact 
with others is minimal would be little in- 
fluenced by the group, whereas the neurotic 
who is anxious about his acceptance by the 
group would strive vigorously to conform to 
group demands. The latter would very likely 
be even more influenced by the group norms 
than the individuals who are less socially 
anxious. Despite our small sample, the results 
are unequivocal. The group influence was 
either absent or even tended to be negative for 
the neurotics, since variability of their judg- 
ments tended to increase with a shift to group 
sessions. And contrary to what might have 
been expected, none of the neurotics could be 
identified as social isolates. Every one of them 
was to a greater or lesser degree a member of 
a 36-patient ward and had established some 
group relationships prior to this study. It may 
be that these group relationships were more 
superficial and casual than is normally the 
case, but no basic difference was apparent. 
However, an increased self-preoccupation was 
clearly observable in these neurotic patients. 
This self-preoccupation may have blunted 
their responsivity to interpersonal interaction 
and thereby reduced the group influence. But 
there was no evidence of psychotic withdrawal, 
and no observable psychotic symptoms among 
the neurotic Ss which could have accounted 
for blunting of responsivity. 

The demonstration in this study of the isola- 
tion of the emotionally disturbed person from 
the group leads one to wonder whether or not 
any attempts to help these suffering individuals 
to integrate more into group activity would 
have any fundamental therapeutic effect. Cer- 
tainly the important group work of Slavson 
(6) with children supports such a notion. It is 
also likely that where individual therapy is ef- 
fective, improved interpersonal and group rela- 
tionships would be discernible. But that 
attempts to increase group integration of 
emotionally disturbed patients could affect 
their underlying difficulties remains to be 
proved. It is evident that what is needed are 
objective methods for evaluating the effects 
upon behavior of various therapeutic procedures. 
Studies such as the present one suggest 
that one method which might have possibilities 
is that of group integration and interaction. 


SUMMARY 


A group of ten nonpsychotic psychiatric pa- 
tients in the open ward of a general hospital 
was compared with seven medical patients 
without disabling psychiatric symptoms, with 
respect to the variability of their judgments of 
the autokinetic phenomenon. All Ss were given 
four successive sessions consisting of ten trials 
each. Sessions were varied from individual to 
group, and from group to individual. The re- 
sults indicated that neurotic Ss were con- 
sistently more variable than nonneurotic Ss 
in their judgments and were less affected by 
the group influence. The judgments of one sub- 
group of nonneurotics showed the tendency to 
converge to a group norm found by Sherif (5). 
The present findings further suggested that 
individuals who first approach an unstructured 
situation in a group will show less difference in 
their judgments than when they first approach 
a similar situation individually and then in a 
group. 


“MANIFEST” ANXIETY, NEUROTIC ANXIETY, AND THE RATE 
OF CONDITIONING¹ 


HUBERT SAMPSON AND DALBIR BINDRA 
McGill University 


THE suggestion that the rate of simple 
conditioning (conditioning eyelid response, 
for example) might be a correlate 
of anxiety has aroused considerable 
interest among experimental psychologists 
(1, 2, 4, 5, 6, 7). In most of the recent studies, 
designed to reveal a possible relation between 
anxiety and the rate of conditioning, the 
measure of anxiety used is what has come to 
be known as the Taylor Scale of Manifest 
Anxiety. The usual procedure is to place 
individuals who score high on the Taylor 
scale in one group labelled “anxious,” and 
those who score low into a second group 
labelled “nonanxious.” Following this alloca- 
tion, the subjects (Ss) in each group are ob- 
served individually in a conditioning situation, 
and their rates of conditioning (and subse- 
quent extinction) are determined. The data 
are then analyzed to see if the rates of condi- 
tioning and extinction are different for the two 
groups. Because these studies have failed to 
produce unequivocal results, and since they 
are of considerable theoretical interest, we 
consider it important to seek the reasons for 
the discrepancies, and to reconcile the results 
insofar as possible. The present paper is an 
attempt in this direction. 

The Taylor scale (6) purports to measure 
manifest anxiety. It is made up of a number of 
items, selected from the Minnesota Multi- 
phasic Personality Inventory, which were 
judged by clinical psychologists to reflect 
manifest anxiety. The items deal with gen- 
erally accepted overt “signs” of anxiety, such 
as sweating, restlessness, tenseness, etc., and 
the score (i.e., the number of positive re- 
sponses) is assumed to indicate the degree of 
manifest anxiety. Two forms of the scale have 
been in use. The second (“new”) form was 
developed from an item analysis of the 65 items 
that constituted the original (“old”) scale. 
Fifty of the most discriminating items were 
selected to make up the new scale (5). 

1 This study was supported in part by the National 
Research Council of Canada (Grant A-P. 12). 

Taylor (6), using the old form of the scale 
and a simple eyelid-conditioning situation, 
found the anxious group to have a more rapid 
conditioning rate than the nonanxious group. 
Spence and Taylor (5) worked with the new 
form, and also found significant differences in 
amount of conditioning between the two 
groups. However, Hilgard, Jones, and Kaplan 
(2), in one phase of a similar study, did not 
find a significant difference between anxious 
and nonanxious Ss. Finally, Bitterman and 
Holtzman (1), working with conditioning of 
galvanic skin response, also were unable to 
demonstrate any difference in rates of condi- 
tioning and extinction between groups sepa- 
rated on the basis of scores on the Taylor 
scale. 

As Bitterman and Holtzman (1) have sug- 
gested, part of the disagreement in results 
may be due to differences in the ranges of 
scores used to define the anxious and non- 
anxious categories. These ranges are shown in 
Table 1. It is evident that Bitterman and 
Holtzman’s nonanxious and anxious groups 
(ranges: 2 to 11; 12 to 40) were not comparable 
to those of Taylor or Hilgard et al. (ranges: 7 
and below; 23 and above). But these differ- 
ences in scores mean little unless we know 
what the scores represent. Therefore, before 
attempting to reconcile these findings in terms 
of differences in defining scores, it would seem 
wise to examine the validity of the Taylor 
scale, and to delineate as clearly as possible 
the variable or variables represented in it. 
This was the aim of the following study, 
which is based on the assumption that neurotics 
(especially anxiety patients) would 
score high on any scale that “truly” measures 
manifest anxiety. 


TABLE 1 

DEFINITIONS OF “ANXIOUS” AND “NONANXIOUS” 
GROUPS IN TERMS OF RANGES OF SCORES MADE ON 
TAYLOR’S SCALE OF MANIFEST ANXIETY 

                              RANGE 
INVESTIGATOR                  NONANXIOUS GROUP            ANXIOUS GROUP                  FORM USED 
Taylor (6)                    1-7 (lower 9 percentiles)   24-36 (upper 12 percentiles)   Old scale 
Spence & Taylor (5)           9 & below (lower 20%)       24 and above (upper 20%)       New scale 
Hilgard, Jones, & Kaplan (2)  7 or below                  23 or above                    Old scale 
Bitterman & Holtzman (1)      2-11                        12-40                          Old scale 


PROCEDURE 


The new 50-item form of Taylor’s scale was ad- 
ministered to 51 male, hospitalized, neurotic patients 
at a Veterans Hospital in the Montreal area. Not less 
than six nor more than 15 patients were present at any 
one testing session. The patients were assured that 
their answers to the test items would have no bearing 
on their treatment, and in general, every effort was 
made to obtain truthful responses. Subsequently, each 
patient’s hospital record was studied, and the most 
prominent symptoms present upon admission, together 
with the diagnosis, were noted. 

The same scale was administered to several large 
groups of college students, both men and women. There 
were 223 normals in all. 


RESULTS 


Neurotics vs. normals. The maximum score 
possible on the new Taylor scale is 50. The 
scores of the neurotic group ranged from 2 to 
45; of normals, from 2 to 43. The ranges over- 
lap almost completely. But, as may be ex- 
pected, the means of the two groups are significantly 
(p < .001) different from each 
other, being roughly 26 (SD = 10.02) for 
neurotics, and 16 (SD = 8.05) for normals. 
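
The size of this difference can be checked roughly from the summary figures alone. The sketch below is not the authors' computation (the text does not say which test was used); it assumes an unequal-variance ratio of the mean difference to its standard error, and it uses the rounded means quoted above. The function name is hypothetical.

```python
import math

def unequal_variance_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between two means divided by its standard error."""
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error

# Rounded summary figures from the text: neurotics first, then normals.
t = unequal_variance_t(26, 10.02, 51, 16, 8.05, 223)
print(round(t, 2))
```

With these figures the ratio comes out above 6, well beyond the value of about 3.3 that marks the two-tailed .001 level on the normal curve, which agrees with the reported p < .001.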

Anxiety patients vs. other neurotics. Anxiety 
was a prominent symptom in 27 of the 51 
patients. These 27 “anxiety neurotics” were 
compared with the remaining 24 “nonanxiety 
neurotics” (i.e., those patients for whom 
manifest anxiety was not a prominent symp- 
tom). The biserial correlation between these 
clinical ratings and scores on the Taylor 
scale showed only a chance relation (r_bis = 
-.003) between the two variables. This 
finding is consistent with that of Bitterman 
and Holtzman (1) who report no significant 
relation between scores on the Taylor scale 
and clinical ratings of anxiety. 

However, a closer examination of the dis- 
tribution of scores made by the neurotics re- 
vealed a tendency for the nonanxiety neurotics 
to fall at the two extremes of the distribution, 
and for the anxiety patients to occupy the 
middle range (Q1 to Q3) of the distribution. 
To check the statistical significance of this 
tendency, all patients having a score of below 
19 (i.e., below Q1) and above 33 (i.e., above 
Q3) were placed in one group, and the number 
of anxiety neurotics in this group was com- 
pared with the number of anxiety neurotics 
among the patients who scored between 19 and 
33 (i.e., Q1 to Q3). The chi-square test showed 
a statistically significant (p < .02) tendency 
for the anxiety patients to fall within the 
middle range of the distribution of the scores 
for neurotics. The symptom other than anxiety 
symptoms that was most frequently mentioned 
in the patients whose scores fell at the two 
extremes of the distribution (below Q1 and 
above Q3) was depression. 


DISCUSSION 


1. What exactly do the scores on the Taylor 
Scale of Manifest Anxiety represent? Taylor 
(6), as well as Holtzman, Calvin, and Bitter- 
man (3), assume that, so far as the overt 
signs of manifest anxiety are concerned, the 
only difference between neurotics and normals 
is one of degree. Inasmuch as the Taylor 
scale differentiates between normals (mean = 
16) and neurotics (mean = 26), it may appear 
to be a reasonably valid indicator of anxiety. 
But, according to our results, this validity 
does not hold through the whole range of 
scores. The data on the hospitalized neurotics 
show clearly that extremely high scores are no 
longer associated with manifest anxiety, but 
mainly with depression.² Most of the anxiety 
patients fall within Q1 and Q3 (i.e., from 19 to 
33) of the neurotic group. On the basis of 
these findings, we are inclined to believe that 
whereas the scores of from 19 to 33 on the 
Taylor scale are associated with manifest 
anxiety, the scores outside this range do not 
represent manifest anxiety. It would seem 
that a person making a score of between 
roughly 19 and 33 is most likely to be classified 
as anxious by clinical criteria (though not 
necessarily as neurotic). 

2 The impression of clinical psychologists and 
psychiatrists that anxiety and depression are closely 
allied conditions is irrelevant to this discussion, for we 
are concerned only with the correlates of symptomatic, 
manifest anxiety. 

This view is supported by the fact that the 
total range of scores on the Taylor scale is not 
related to clinical criteria of anxiety (1), and 
is contrary to Taylor’s assumption (6) that 
different scores represent different degrees 
of anxiety. We suggest that the Taylor scale 
does not measure the degree of manifest 
anxiety, but can be used only to separate 
those individuals who are likely to be classified 
as anxious by clinical criteria from those who 
are not. Specifically, an individual who 
scores within the range of 19 to 33 shows those 
behavioral signs of manifest anxiety which 
make it most probable that clinically he will 
be judged as anxious rather than as non- 
anxious. 

2. We turn now to the contradiction in the 
results concerning the relation between con- 
ditioning and manifest anxiety. Spence and 
Taylor (5) used the new Taylor scale, and 
selected for their anxious group individuals 
who had a score of 24 or above. Since they 
worked with normal Ss, and the mean score 
for the normals is roughly 16, it is a reasonable 
guess that very few, if any, of their Ss had 
scores above 33. That is to say, almost all 
Ss in the anxious group used by Spence and 
Taylor fell within the range which, according 
to our results, is most closely associated with 
manifest anxiety. Their nonanxious group con- 
sisted of Ss with scores of nine or below, that is, 
outside the 19 to 33 range. Thus the two groups 
were well separated with respect to manifest 
anxiety. Even though we do not have the com- 
parable scores for the old scale, it appears 
certain that the anxious and the nonanxious 
groups in Taylor’s original study (6) were 
equally well separated (anxious: 24-36; non- 
anxious: 1-7). Thus in the two studies where 
the anxious and the nonanxious groups were 
well separated with respect to manifest 
anxiety, the differences between the two 
groups in susceptibility to conditioning were 
also significant. 

In Bitterman and Holtzman’s study (1) 
on the other hand, the scores of the anxious 
group ranged from 12 to 40, violating, at 
both ends, the limits (19 and 33) within which 
scores on the Taylor scale are associated with 
manifest anxiety. This may account for their 
failure to find a significant difference between 
the two groups. The finding that is not con- 
sistent with this interpretation is that of Hil- 
gard and his collaborators (2). They selected 
20 Ss and divided them into two well-sepa- 
rated groups (nonanxious: 7 and below; 
anxious: 23 and above). But they found that 
in terms of simple conditioning there was no 
difference between the two groups. The dis- 
crepancy between this result and those of 
Taylor is difficult to explain. Hilgard et al. 
attribute it to differences in the experimental 
arrangements. Equally likely, it may be be- 
cause of the small number of cases used in 
their study. 
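
The comparison made above, between an anxious group's range of scores and the limits of 19 and 33, can be stated compactly. The following is only an illustrative sketch; the function and constant names are hypothetical, and the limits are simply those proposed in this paper.

```python
# Limits within which, according to the text, Taylor-scale scores
# are associated with manifest anxiety.
LOWER_LIMIT, UPPER_LIMIT = 19, 33

def limits_violated(low, high):
    """Return which limits an anxious group's score range (low..high) exceeds."""
    violated = []
    if low < LOWER_LIMIT:
        violated.append("lower")
    if high > UPPER_LIMIT:
        violated.append("upper")
    return violated

# Bitterman and Holtzman's anxious group ranged from 12 to 40 (Table 1).
print(limits_violated(12, 40))
```

For the range 12 to 40 both limits are reported as violated, which is the sense in which that group is said to violate the limits "at both ends."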

Of special relevance to this discussion is the 
series of studies in which rate of conditioning 
was related directly to clinical ratings. Welch 
and collaborators (4, 7) used normal, neurotic, 
and psychotic groups and were able to demon- 
strate differences in conditioning rate within 
and between these groups. Bitterman and 
Holtzman (1) also found a difference in the 
rate of conditioning (and extinction) when 
their groups were divided in terms of clinical 
criteria of anxiety. This suggests that the 
rate of conditioning itself may be a more 
sensitive indicator of anxiety than are scores 
on the Taylor scale. 


SUMMARY 


The validity of the Taylor Scale of Manifest 
Anxiety was examined with a view to 
reconciling the contradictory results of the 
studies of the relation between anxiety and 
the rate of conditioning. The scale was admin- 
istered to 51 hospitalized male neurotics and 
to 223 college students. The results indicated 
that different scores on the scale do not repre- 
sent different degrees of manifest anxiety, 
though the scores within a limited range (19 
to 33) are more likely to be associated with a 
clinical diagnosis of “anxious” than are 
scores above and below this range. This 
interpretation helps to reconcile the contra- 
dictory earlier findings concerning the relation 
between conditioning and anxiety. It is sug- 
gested that the rate of conditioning itself may 
be more closely related to differences in mani- 
fest anxiety than are scores on the Taylor 
scale. 


PREDICTING HOSPITALIZATION OF PSYCHIATRIC OUTPATIENTS 


DONALD R. PETERSON 
University of Illinois 


THERE is a definite clinical need for an 
instrument which can aid in isolating 
from the heterogeneous population 
“psychiatric outpatients,” those cases who 
not only fail to profit from therapy but who 
become even more seriously disturbed as 
treatment continues. One of the patients in 
the present study had killed his wife. Two had 
attempted murder. Six had attempted suicide. 
Treating such cases on an outpatient basis 
seems rather imprudent, and means for pre- 
dicting their behavior should permit wiser 
disposition. More generally, increased accuracy 
in predicting such an event as hospitalization 
should lead to improvement in the efficiency 
of outpatient treatment from the point of 
view of the patient, his therapist, and society 
as a whole. 

This study constitutes an attempt to meet 
the need for greater predictive accuracy, but 
it was also designed to provide a partial 
definition of the concept, “latent psychiatric 
illness.” It is ordinarily assumed that any 
patient who develops a personality disorder 
severe enough to require institutionalization 
has, before the manifest outbreak of symptoms, 
certain predispositions to illness. Meaningful 
definition of these predispositional tendencies 
is dependent upon measurement of behavior 
during the period of “latency,” and it is this 
measurement which constitutes the second aim 
of the investigation. 

A survey of research on prognosis reveals 
only one empirical study (4) where hospitaliza- 
tion was employed as a criterion of psychiatric 
outcome, and for it the sample was drawn from 
a population not already in the hospital. 
While the results of that study suggest the 
possibility of predicting hospitalization 
through use of a psychometric test (the 
Cornell Selectee Index), general inference is 
limited by the size of the sample (N = 10) 
and the fact that the procedures used are not 
commonly employed in present clinical prac- 
tice. The literature on prognosis from psycho- 
metric data has recently been surveyed by 
Windle (23). The studies examined in that 
review have offered hypotheses for test in this 
investigation; they have served as a fund of 
possible psychometric predictors. A list of 
possible nonpsychometric factors was com- 
piled by examining the studies of Wittman 
(24), Dunham and Meltzer (3), Clark (2), 
Jenkins (9), Kant (11), Fisher and Hayes (5), 
Chase and Silverman (1), Mayer-Gross and 
Moore (16), and Lewis (14). 

1 Published with the permission of the Chief Medical 
Director, Department of Medicine and Surgery, 
Veterans Administration, who assumes no responsi- 
bility for the opinions expressed or the conclusions 
drawn by the author. The paper is an abstract of a 
Ph.D. thesis done while the author was at the Uni- 
versity of Minnesota. The constant aid and encourage- 
ment of Dr. Paul E. Meehl is gratefully acknowledged. 


METHOD 


Plan of investigation. In most respects, the pro- 
cedure employed in this study follows that outlined by 
Horst (8) for general prediction problems. The follow- 
ing steps have been carried out: 

1. A criterion was defined as admission to a psy- 
chiatric hospital following psychological testing and 
two or more interviews by staff members at a VA 
mental hygiene clinic. 

2. Every case in the clinic file was examined, and 
all those who met the criterion were placed in the 
“hospitalized” category. Certain nonpsychometric 
data and the results of the Wechsler-Bellevue, MMPI, 
and Rorschach were recorded for each member of the 
sample. Hereinafter, these hospitalized patients shall be 
referred to as Group I (N = 108). 

3. A sample of nonhospitalized patients at the same 
clinic was gathered by going through the psychological 
test file and selecting every tenth card which showed 
that the patient had had all three of the tests considered 
in the study. All subjects (Ss) who had been hospi- 
talized were left in Group I. Again, those cases who were 
interviewed less than two times were excluded, but no 
further restrictions were made. Results of the tests and 
nonpsychometric data were recorded for the remaining 
cases, hereinafter referred to as Group II (N = 114). 

4. In accordance with a double cross-validation 
design (12, 18), each group was split into two sub- 
samples by alphabetizing the data sheets and sorting 
them alternately into two piles. Four groups were 
consequently formed, specified as follows: Group I-A, 
hospitalized (N = 54); Group II-A, not hospitalized 
(N = 57); Group I-B, hospitalized (N = 54); Group 
II-B, not hospitalized (N = 57). 

5. Next, a sort of gross item analysis was performed. 
One set of significantly differentiating signs was isolated 
by comparing Group I-A with Group II-A (the 
combination of these two groups shall be referred to as 
Sample A). Similarly, a set of discriminating signs was 
derived by comparing Group I-B with Group II-B 
(this combination shall be referred to as Sample B). 

6. For each sample, the following indices were made 
up: (a) a device consisting only of nonpsychometric 
signs; (b) a device consisting only of signs from the 
MMPI; (c) a device comprising both nonpsychometric 
and MMPI signs; (d) a comprehensive device, including 
all signs which discriminated between groups. 

7. Each index derived from Sample A was cross- 
validated on Sample B; conversely, those derived from 
Sample B were cross-validated on Sample A. This pro- 
cedure resulted in an estimate of the efficiency which 
the lists of signs would probably have if employed with 
new samples. 

8. Groups of false positives and false negatives were 
selected and case histories carefully studied in an effort 
to find ways of improving prediction. 

9. Finally, indices combining data from both samples 
were constructed by including only signs which differentiated 
at the .05 level of significance for both samples.² 
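
The splitting procedure of step 4 (alphabetize the data sheets, then sort them alternately into two piles) may be sketched as follows. This is an illustration only; the identifiers are placeholders standing in for the data sheets, not actual records, and the function name is hypothetical.

```python
def split_alternately(names):
    """Alphabetize the names, then deal them alternately into two piles."""
    ordered = sorted(names)
    return ordered[0::2], ordered[1::2]

# Placeholder identifiers standing in for the data sheets (not real records).
group_1 = [f"case_{i:03d}" for i in range(108)]   # hospitalized, N = 108
group_2 = [f"ctrl_{i:03d}" for i in range(114)]   # not hospitalized, N = 114

group_1a, group_1b = split_alternately(group_1)
group_2a, group_2b = split_alternately(group_2)

# Indices derived on one sample are validated on the other, and vice versa.
sample_a = group_1a + group_2a
sample_b = group_1b + group_2b
print(len(group_1a), len(group_1b), len(group_2a), len(group_2b))
```

The four subgroup sizes come out as 54, 54, 57, and 57, matching those given in step 4.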

Subjects and procedure. All Ss had been patients at 
the Veterans Administration Mental Hygiene Clinic, 
St. Paul, Minnesota, at some time during the period 
from 1947 through 1951. The clinic is available only to 
veterans with a “service-connected” psychiatric disability. 
“Service connection,” in this sense, is a rather 
broad concept which includes origin of a neuropsy- 
chiatric disorder, aggravation of such a disorder, or 
“emotional” contribution to some nonpsychiatric disa- 
bility. Nearly all the patients are male, white, and live 
in the vicinity of Minneapolis and St. Paul. Veterans of 
both world wars are included in the sample, although by 
far the majority served only in the second one. Almost 
without exception they had had psychotherapy, 
mainly with psychiatric residents, less often with clinical 
psychology trainees and social workers. All therapy 
was supervised by psychiatric and psychological 
consultants. 

Most of the psychometric tests had been adminis- 
tered, scored, and interpreted by trainees in clinical 
psychology, but scoring of the Wechsler-Bellevue and 
Rorschach was checked in an unusually careful way by 
the chief psychologist. For the Rorschach, the scoring 
system employed most nearly approximates that of 
Hertz (7) for location, form, and specification of popular 
responses. Determinants were scored more like the 
Klopfer system (13) than any other. 

2 Katzell (12) has suggested using less stringent 
standards for selection of items in the final scale. His 
procedure was also employed, but the resulting indices 
were found to have only slightly greater discriminating 
power than the shorter ones discussed here. Since the 
shorter indices work nearly as well, are less cumbersome, 
and less subject to shrinkage, they alone are reported. 

The item analysis. The statistical procedure most 
extensively used was chi square. Calculation was done 
in accordance with the method suggested by McNemar 
(15, formula 87, p. 207) for 2 × k tables. Wherever any 
expected frequency in a 2 × 2 table fell below 10, a 
formula incorporating Yates’s correction (15, formula 
85a, p. 207) was applied. Cutting points were established 
as rationally as available literature and common 
sense would permit before the tests were calculated. 
For certain unimodal continuous distributions, the 
t test was employed (10, formula 5.03, p. 74). Where 
heterogeneity of population variance seemed likely, the 
Cochran-Cox approximation method was used (10, pp. 
74-75). Generally speaking, if no population difference 
was suggested by t, the variable was henceforth ignored. 
If, however, the t test did suggest a reliable difference 
between means, chi square was also used, with a cutting 
point approximately midway between the two sample 
means. Only if the latter test indicated significant 
discrepancy was the sign included as a member of a 
predictor set. 
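
The rule for 2 × 2 tables described above can be illustrated with a short sketch. It assumes the usual textbook chi-square formula for a fourfold table and the usual form of Yates's correction; whether these agree term for term with McNemar's formulas 87 and 85a has not been checked here, and the function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d, min_expected=10):
    """Chi square for the 2 x 2 table [[a, b], [c, d]].

    Yates's correction is applied when any expected frequency falls
    below min_expected, the rule described in the text.
    Returns the statistic and a flag telling whether the correction was used.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    use_yates = min(expected) < min_expected
    diff = abs(a * d - b * c)
    if use_yates:
        # Reduce the absolute cross-product difference by n / 2.
        diff = max(diff - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    return chi2, use_yates
```

A call such as `chi_square_2x2(20, 10, 10, 20)` uses the uncorrected formula, since every expected frequency is 15; a table such as `chi_square_2x2(8, 2, 2, 8)` triggers the correction, since every expected frequency is 5.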

Statistical tests were performed with respect to the 
following nonpsychometric variables: age, present job 
status (employed or unemployed), education, job 
stability (number of jobs held since discharge from the 
service), occupational level, marital status, number of 
children (if married), place of residence (if single, 
widowed, or divorced), parents (living or dead), siblings 
(present or absent), birth order, evidence of broken 
home (arbitrarily defined as dissolution of the home by 
separation of the parents, or death or institutionali- 
zation of one or both parents before the patient was 15), 
number of previous hospitalizations, mention of drink- 
ing as a problem in the intake interview, impressions of 
patient behavior during testing (psychologist’s de- 
scription), and diagnosis (service-connected disability, 
beginning diagnosis at the clinic, and closing diagnosis). 

For the Wechsler-Bellevue, no type of complex 
pattern analysis was attempted. Reviews by Rabin 
(19), Rabin and Guertin (20), and Schofield (22) 
suggested that such analyses were likely to be un- 
profitable. In consequence, only the full scale IQ and 
the relationship between verbal and performance IQ’s 
were examined. 

For the MMPI, the literature on pattern analysis 
(6, 17, 21) suggested several possible discriminators. 
The relationship between the “neurotic” end of the 
profile (Hs, D, and Hy) and the “psychotic” end (here 
defined as including Pa, Sc, and Ma), the relative 
elevation of Sc and Pt, the height of D in relation to 
Hs and Hy, the difference score calculated by sub- 
tracting F from K (T scores), the total number of 
scores over 70, and gross elevations on F and Pd were 
all tested by chi-square methods. 

Tests were made for each of the following Rorschach 
signs and scores: R, emphasis on W, D, Dr, and S, F%, 
F+%, A%, W:M, (H + A):(Hd + Ad), M:ΣC, ΣC, 
FC, CF and C, FC:(CF + C), M, M+:M-, M:FM, 
Fm, (VIII-X)%, Fc, Fk, FK, FC’, cloud, fire, blood, or 
smoke content, sex, food, positional responses, and 
abstractions. Both original and additional responses 
were counted for the last two signs. For all others only 
the original responses were considered. 


RESULTS³ 


Formation of indices. In all, 56 variables 
were examined by chi-square techniques. Of 
these, nine appeared to discriminate between 
hospitalized and nonhospitalized patients in 
Sample A; 11 appeared effective in Sample B. 
Overlap occurred for seven variables. All 
signs were weighted in accordance with 
observed frequency trends, and the indices 
shown in Table 1 were made up. 

3 The basic data for those variables which yielded 
significant differentiation between hospitalized and 
nonhospitalized patients in either Sample A or Sample 
B have been deposited with the ADI. Order Document 
Photoduplication Service, Library of Congress, Wash- 
ington 25, D. C., remitting in advance $1.75 for 35 mm. 
microfilm or $2.50 for 6 by 8 in. photocopies. Make 
checks payable to Chief, Photoduplication Service, 
Library of Congress. A microfilm copy of the complete 
thesis can be obtained by ordering Publication No. 4348 
from University Microfilms, Ann Arbor, Michigan. 


TABLE 1 

PREDICTIVE INDICES DERIVED FROM 
CHI-SQUARE ANALYSIS 

                                                                          SAMPLE IN WHICH 
VARIABLE                      SIGN                               WEIGHT   DISCRIMINATION OCCURRED 

Nonpsychometric 
  Diagnosis                   Psychosis 
                              Psychoneurosis, mixed or 
                                unclassified, depressive 
                                reaction, or obsessive- 
                                compulsive reaction 
                              Any other diagnosis 
  Previous hospitalization    Two or more 
                              One 
                              None 
  Marital status              Single 
                              Divorced or widowed 
                              Married 
  Drinking as a problem       Mentioned in intake interview 
                              Not mentioned 
  Employment status           Unemployed 
                              Employed 
MMPI 
  “Neurotic” scores:          Pa or Sc or Ma > Hs or D or Hy 
    “psychotic” scores        Pa and Sc and Ma < Hs and D and Hy 
  Sc: Pt                      Sc > Pt 
                              Sc < Pt 
  F and K difference score    > 0 
                              < 0 
  Pd                          > 65 
                              < 65 
  D: Hs and Hy                D > Hs or Hy 
                              D < Hs and Hy 
  Number of scores over 70    Four or more 
                              Less than four 
Wechsler-Bellevue 
  Full scale IQ               < 105 
                              ≥ 105 
Rorschach 
  A% 

Estimates of predictive and discriminatory 
efficiency. On the basis of the signs listed in 
Table 1, every patient was given four scores, 
one based solely on nonpsychometric data, 
one based only on the MMPI, one obtained by 
adding those two scores, and one based on all 
signs which seemed to show differentiating 
power. Cutting scores were established by 
equalizing false positives with false negatives 
in each derivation sample, and cross-validation 
was carried out by examining the percentage 
of correctly classified cases in the samples 
from which the indices had not been derived. 
Thus, indices derived from Sample A were 
cross-validated on Sample B; those derived 
from Sample B were tested with Sample A. 
Final indices were formed by considering only 
those signs which discriminated between hos- 
pitalized and nonhospitalized patients in 
both samples. For these indices, percentages of 
accurate classification over the entire sample 
were computed. Results are presented in 
Table 2. 
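
The way a cutting score is fixed by equalizing false positives with false negatives can be sketched as below. This is an assumed reading of the procedure, not the author's worksheet: scores are taken to be whole numbers, hospitalization is predicted when a score lies above the cutting score, and ties are resolved in favor of the lowest cutting score. All names are hypothetical.

```python
def equalizing_cutoff(hospitalized_scores, nonhospitalized_scores):
    """Pick the cutting score c (predict hospitalization when score > c)
    that makes false positives and false negatives as nearly equal as possible."""
    observed = set(hospitalized_scores) | set(nonhospitalized_scores)
    # Include one value below the lowest score, so that "predict
    # hospitalization for everyone" is also a candidate.
    candidates = [min(observed) - 1] + sorted(observed)
    best = None
    for c in candidates:
        false_neg = sum(s <= c for s in hospitalized_scores)
        false_pos = sum(s > c for s in nonhospitalized_scores)
        gap = abs(false_pos - false_neg)
        if best is None or gap < best[0]:
            best = (gap, c)
    return best[1]

def percent_correct(cutoff, hospitalized_scores, nonhospitalized_scores):
    """Percentage of cases classified correctly at the given cutting score."""
    hits = sum(s > cutoff for s in hospitalized_scores)
    hits += sum(s <= cutoff for s in nonhospitalized_scores)
    total = len(hospitalized_scores) + len(nonhospitalized_scores)
    return 100.0 * hits / total
```

In a cross-validation of the kind described, `equalizing_cutoff` would be applied to the derivation sample and `percent_correct` to the other sample with that same cutting score.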

TABLE 2 

PERCENTAGE OF CASES CORRECTLY CLASSIFIED BY 
THE VARIOUS INDICES 

                                                ACCURACY       ACCURACY 
                                                PERCENTAGE,    PERCENTAGE, 
                                                DERIVATION     CROSS- 
INDEX                                           SAMPLE         VALIDATION 

Nonpsychometric index derived from Sample A     70.37          63.21 
Nonpsychometric index derived from Sample B     66.04          69.52 
Final nonpsychometric index 
MMPI index derived from Sample A 
MMPI index derived from Sample B 
Final MMPI index 
Combined index (nonpsychometric plus MMPI) 
  derived from Sample A 
Combined index derived from Sample B 
Final combined index 
Comprehensive index (all signs) derived 
  from Sample A 
Comprehensive index derived from Sample B 


Investigation of false negatives and false 
positives. In an effort to isolate additional 
variables which might aid prediction and to 
account for the predictive failures that oc- 
curred, an intensive case study was under- 
taken for certain patients who were incor- 
rectly classified by all devices. A set of false 
negatives was selected by searching Group I 
(i.e., the total hospitalized sample, consisting 
of Groups I-A and I-B) for cases in which 
scores on all four predictive devices were be- 
low the critical level for predicting hospitaliza- 
tion. Eleven individuals who fit that descrip- 
tion were found. The nonhospitalized group 
was similarly examined for patients who were 
inaccurately categorized by all statistical 
methods. Twelve such cases were isolated. 

Telephone conversations with ward secre- 
taries revealed that four of the apparent 
“misses” were spurious. Prediction for them 
would actually have been correct, but did not 
appear so because of clerical errors. Two of the 
false negatives had never been under neuro- 
psychiatric treatment. They had gone to the 
hospital, but not to the psychiatric wards, 
even though the records indicated that they 
had. Conversely, two of the apparent false 
positives were found to have received treat- 
ment in the psychiatric section of the hospital 
after examination and therapy at the clinic, 
but notation of the event was not found in the 
case history (in one instance because the 
author missed it). 

Prediction for the remaining group was 
genuinely wrong. Careful rereading of the 
material in the clinical file and examination of 
all available collateral data suggested that in- 
correct prediction resulted variously from 
patently faulty diagnosis, the operation of 
contingency factors, and omission of individ- 
ually important variables. Inspection of all 
the records yielded only one quantifiable fac- 
tor, out of several examined, which might 
profitably have been included as a general 
predictor. Five of the 10 married patients in 
the hospitalized group were expectant fathers. 
Only one of the five married veterans in the 
set of false positives was in that situation. 
One cannot expect that addition of a “preg- 
nant wife” sign will increase predictive ac- 
curacy to any great extent, but it may offer 
enough promise to warrant further examina- 
tion. 

Analysis of the factors which led to incor- 
rect prediction, however, clearly indicates the 
need for more careful diagnosis and inclusion 
of such nebulous but important factors as the 
relationship between patient and therapist if 
prediction is to be materially improved. 

Use and interpretation of final indices. By 
considering those signs in Table 1 which were 
found to be effective discriminators in both 
half-samples, three scores can be assigned to 
any case: one can be derived from his stand- 
ing with respect to the three nonpsychometric 
variables, one can be obtained through exam- 
ination of his MMPI profile, and one can be 
obtained by adding those two numbers. 
Scores above two for the nonpsychometric 
index, above one for the MMPI index, and 
above three for the combined index suggest 
the likelihood of future hospitalization. Use 
of these cutting scores should result in ap- 
proximately equal percentages of accurate 
prediction in the positive and negative direc- 
tions. For a large sample of new cases, however, 
the absolute number of false positives will 
probably be considerably larger than the 
number of false negatives, and cutting scores 
can be altered in terms of the demands of the 
particular situation where prediction is re- 
quired. 
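
The cutting scores just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `IndexScores` and `predict_hospitalization` are hypothetical, and the individual signs that feed each index are assumed to have been counted already from Table 1 of the study.

```python
from dataclasses import dataclass

# Cutting scores quoted in the text: a score ABOVE the cut suggests hospitalization.
NONPSYCHOMETRIC_CUT = 2
MMPI_CUT = 1
COMBINED_CUT = 3


@dataclass
class IndexScores:
    nonpsychometric: int  # nonpsychometric signs present (assumed pre-counted)
    mmpi: int             # MMPI signs present (assumed pre-counted)

    @property
    def combined(self) -> int:
        # The combined index is the sum of the two separate indices.
        return self.nonpsychometric + self.mmpi


def predict_hospitalization(scores: IndexScores) -> dict:
    """Return the prediction suggested by each index (True = hospitalization likely)."""
    return {
        "nonpsychometric": scores.nonpsychometric > NONPSYCHOMETRIC_CUT,
        "mmpi": scores.mmpi > MMPI_CUT,
        "combined": scores.combined > COMBINED_CUT,
    }
```

Keeping the three cuts as named constants reflects the remark that cutting scores can be altered to suit the demands of the situation in which prediction is required.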

Cautious interpretation is recommended. 
The indices are “final” only in the context of 
this study. The devices, as they stand, have 
not actually been used to predict, although the 
items which make them up have been cross- 
validated with more than usual rigor. Validity 
needs to be checked on other groups, but con- 
sidering the almost infinitesimal shrinkage 
which occurred for each half-sample instru- 
ment on cross-validation, it seems reasonable 
to assume that they will hold up quite well 
with other cases drawn from the same popula- 
tion. 

On the “average,” the person who not only 
fails to profit from outpatient psychotherapy, 
but who actually becomes worse, is single, has 
been previously hospitalized, is likely to have 
a “psychotic” diagnosis, and has a seriously 
disturbed MMPI profile. From these be- 
havior patterns, we ordinarily infer strong 
tension and anxiety, without a healthy, or 
even a neurotic, complement of defenses. The 
mechanisms employed can more accurately be 
described as psychotic; they include withdrawal
from the world of objective affairs,
various distortions in perception of the en- 
vironment, and emotional frigidity. 

In all important respects, the pre-illness 
MMPI profile conforms to the characteristic 
pattern found in manifest psychosis (17). We
can probably assume that, at the time these 
patients were tested, psychotic behaviors were 
not clinically apparent in full form; that, if 
they had been, the patients who showed them 
would have been sent to the hospital immedi- 
ately. If this is a safe assumption, predisposi- 
tional tendencies to severe psychiatric illness 
can be defined in terms of the following test 
conditions: (a) clinical observation reveals no 
obvious symptoms severe enough to warrant 
institutionalization, but tendencies in that 
direction may still be present, as evidenced by 
increased likelihood of a “psychotic” diagnosis;
(b) examination of biographical data shows a
history of adaptive failure, as evidenced by 
prior hospitalization and single marital status; 
(c) scrutiny of the MMPI record reveals a 
profile that is different in no essential respect 
from that found in manifest psychosis. 

The heterogeneity of the sample, however, 
reduces the meaning of interpretations of this 
sort, and studies of more specifically defined 
groups are badly needed. To this end, an in- 
vestigation of a group of latent schizophrenics 
is now under way. 


SUMMARY 


This study constitutes an attempt to devise 
simple, widely applicable, and maximally 
precise indices to aid in predicting hospitaliza- 
tion of psychiatric outpatients, as well as to 
formulate a partial operational definition of 
the concept, “latent psychiatric illness.” 

Data were gathered at a VA mental hygiene 
clinic for all patients who underwent psycholog- 
ical examination, were seen for at least two 
interviews by clinic staff members, and later 
were admitted to psychiatric hospitals. In 
all, 108 such patients were found. They were 
compared with 114 nonhospitalized patients 
at the same clinic in terms of certain non- 
psychometric data and the results of three 
psychological tests, the Wechsler-Bellevue, 
the MMPI, and the Rorschach. To minimize 
chance effects, samples were split in accordance 
with a double cross-validation design. Indices 
were derived separately for each half-sample 
and cross-validated on the other. Percentages 
of accurate classification ranged from 63 for 
one of the nonpsychometric devices to 75 
for an index composed of four nonpsycho- 
metric signs, four MMPI signs, and one sign 
from the Wechsler-Bellevue. Generally, predic- 
tion made on the basis of nontest data alone 
or from the MMPI alone would have been 
correct about two-thirds of the time; predic- 
tions based on indices comprising both non- 
psychometric and MMPI signs would have 
been correct for a little less than three-fourths 
of the cases. 

Individual false positives and false nega- 
tives were then selected from the hospitalized 
and nonhospitalized groups. Such patients, 
for whom prediction was wrong in terms of all 
devices, were submitted to intensive case 
study. Clerical errors regarding hospitalization 
had been made in some of the individual
records; accurate prediction would in fact 
have been made for four of the 23 apparent 
misses. For the remaining 19, contingency 
factors, inaccurate diagnosis, and omission of 
uniquely important variables seemed to ac- 
count for most of the predictive failures. 

Predictors derived from the subsamples 
were combined into single indices by including 
only those signs which discriminated at the 
.05 level of significance between hospitalized
and nonhospitalized patients in both half- 
samples. Three forms of index were derived, 
one consisting only of nonpsychometric variables,
one consisting only of MMPI signs, and
one comprising both nontest and MMPI fac- 
tors. Choice will depend on the amount of 
data available. 

The findings contain a partial basis for 
definition of predisposition to severe psychiat- 
ric illness. The mean MMPI profile of the 
subsequently hospitalized patients is that of 
manifest, not latent, psychosis, even though 
such deviations could not have been flagrantly 
obvious in other clinical behavior at the time 
of testing. Inference of this nature, however, is 
limited by the heterogeneity of the group, and 
a study of the prepsychotic behavior of a group 
of latent schizophrenics is now in progress. 


THE PERFORMANCE OF SCHIZOPHRENICS ON SOCIAL CONCEPTS


In recent years there has been a growing
tendency to view schizophrenia as a
thinking disorder. This impairment of
thought processes in schizophrenia has been
demonstrated in both verbal tasks (1, 19, 21,
22) and performance tasks (3, 4, 14, 20), and
has been characterized by some as an inability
to generalize or to form concepts.

However, the origin of this impairment
is as yet open to conflicting interpretations.
For example, Vigotsky (20) believes that the
disturbance in concept formation is caused by
an underlying organic process. Goldstein
wrote in 1939, “this similarity (in test per-
formances of organic and schizophrenic cases)
does not permit the rash assumption that
schizophrenia is fundamentally an organic
disease; the great differences in the behavior
of organic and schizophrenic patients in
various directions will prevent such conclu-
sions. However, the similarity points to an
organic process as cause of this impairment,
whether this process be primary or of secondary
origin” (11, p. 583). In a later publication
(12), however, Goldstein seems to view the
problem as currently indeterminate. Both
Goldstein and Vigotsky stress the general
nature of the defect. Indeed, Goldstein con-
siders the abstract and concrete attitudes not
merely as cognitive functions but as “capacity
levels of the total personality, each furnishing
the basis for all performances on a certain
plane of reference to the outer world situation”
(12, p. 18).

For Cameron (5, 6, 7, 8, 9), the disorganiza- 
tion in schizophrenic thinking is a symptom 
of the patient’s “social disarticulation,”
initially occasioned by defective role-taking 
ability. This isolation from common environ- 
mental influences leads to a progressive sub- 
stitution of asocial fantasy for realistic inter- 
change of attitudes and viewpoints, resulting 
in a gradual impairment of organized, socially 
acceptable thinking. 

It is interesting to note, in view of the above 
conflicting interpretive considerations, that 
there has been no attempt to study systemati- 
cally the test performance of schizophrenics 
on problems involving social concepts, as 
compared to their test performance on prob- 
lems involving formal concepts, i.e., the more 
impersonal, less socially toned type of concept.
A functional-behavioral stress, such as
Cameron’s, in the problem of schizophrenic 
thinking would imply a selective impairment 
of cognitive functioning, one more apt to be 
revealed to the extent that the content of the 
concept to be formed by the patient in the 
experimental situation is related to his aware- 
ness of the realities of interpersonal situations 
about him. 

The purpose of this experiment is to in- 
vestigate the hypothesis that the social con- 
ceptual performance of schizophrenics is im- 
paired relative to that of a normal group, even 
though both populations are equated on formal 
conceptual performance; or to put it differ- 
ently, that the schizophrenic deficit relative to 
normals is greater on social conceptual problems
than on formal conceptual problems. By
formal concept is meant an abstraction which is 
common to a number of noninterpersonal 
situations, and which is capable of describing 
aspects of such situations in (a) physical, 
psychophysical, or quantitative terms, such as, 
volume, color, number, or (b) logical-relational
terms, such as concepts based on principles
of hierarchical classification like species-
genus, or relational concepts like “middleness.”
A social concept is an abstraction which
is common to a number of situations involving
people, and which is capable of describing
aspects of such situations in terms of the func-
tional relationships exhibited by people in
their mutual interaction, i.e., cooperation,
encouragement, courtesy.


* The terms social concept and formal concept will
be explicitly defined below.


TABLE 1

AGE, VOCABULARY LEVEL, AND EDUCATION OF THE
CONTROL AND SCHIZOPHRENIC GROUPS

                       AGE           VOCABULARY*         EDUCATION**
GROUP            N     Mean   SD     Mean   SD           Mean   SD
Controls         31    29.4   5.9    [illegible]         [illegible]
Schizophrenics   31    29.8   5.5    [illegible]         [illegible]

[The vocabulary and education entries are not recoverable from the
scan; the surviving figures, whose columns cannot be assigned with
certainty, are 2.0 and 2.4, and 1.1 and 0.9.]

* In Wechsler-Bellevue weighted score units.
** Years of schooling completed.

Social concepts are abstracted typically from 
situations in which people are reacting to one 
another. They are derived from such situations. 
It should be pointed out, however, that the 
distinction drawn between social concepts and 
formal concepts is not an absolute one. All 
concepts are social in the sense that they are 
learned and applied in social contexts. Also, 
the function of concepts, social or formal, is 
to provide a tool whereby hitherto unrelated 
situations may be related. What is stressed 
here is the relatively greater application of 
so-called social concepts than of so-called 
formal concepts to interpersonal situations. 


METHOD 


Subjects. The experimental group consisted of 31 
schizophrenic patients at the Brooklyn State Hospital.
Each member of this group was matched with a
nonpsychotic control on the basis of age, education, and
intelligence as measured by the Wechsler-Bellevue
vocabulary score. All subjects (Ss) were male, white,
native born, and had lived most of their lives in an 
urban environment. Table 1 presents means and 
standard deviations of the two groups for age, educa- 
tion, and vocabulary score. It will be noted that for 
each comparison the groups are approximately equal. 

Only those patients were selected whose diagnoses 
were unanimous and uncomplicated by neurological 
involvement or physical disability. Other criteria 
for selection were the testability and degree of co- 
operativeness of the patient. As a whole, the patient 
population could be characterized clinically, or in a 
Kraepelinian sense, as relatively undeteriorated. The 
median number of months of hospitalization was 11.3 
with 74 per cent of the group having had 2 years or less
of hospitalization. The range was from two months to
nine years. The subdiagnostic breakdown was as 
follows: 13 catatonics, 12 paranoids, 4 mixed, 1 hebe- 
phrenic, and 1 simple. 

For the control group, 24 Ss were recruited from 
factories and business establishments in greater New 
York City. The remaining seven controls were ob- 
tained from the Vocational Advisory Service of the 
YMCA. Only those clients who were adjudged by the 
vocational counselor as not exhibiting undue emotional 
difficulties were utilized in the study. As a check upon 
the possibility of previous mental illness in the control 
group, each control was asked to enumerate any diseases 
or physical disabilities he had had both as a child and 
as an adult, whether he had spent any time in an institu- 
tion or a hospital for tuberculosis or heart disease, 
and finally whether he had ever had a “nervous break- 
down,” been treated by a psychiatrist, or spent time 
in a mental hospital. None of the controls used in the 
study gave indications of mental abnormality either 
in overt behavior, quality of verbalization, or from 
previous history. 

With respect to occupation, the majority of the 
members of both groups fall into the semiskilled cate- 
gory. It was not possible to make a man-for-man match- 
ing for occupation, but the groups as a whole are fairly 
comparable with respect to this variable. 

Tests. All tests were administered individually by 
the writer in one session. To obtain indices of formal 
conceptualization ability, it was decided to use a verbal 
scale and a performance scale. The ones selected were 
the verbal analogies and picture reasoning scales from 
Jastak’s battery, Psychometric Patterns (16), which 
was designed for clinical use. These tests have been 
standardized on the same normal population and lend 
themselves to objective scoring. Furthermore, the writer 
had used them in a clinical setting for over a year and 
had found them effective with psychotic subjects. The 
verbal analogies test consists of 32 items arranged in 
order of difficulty; each item is in the form, “the hand 
has fingers, the foot has...” The picture reasoning 
test comprises 11 items, also arranged in order of dif- 
ficulty, with each item consisting of five cards. The 
S’s task is to place each set of five cards in proper 
sequence according to a logical-relational concept, e.g., 
arrangement according to the size of the box shown in 
each picture. For these formal concept tests, differential 
credits for each item are assigned, depending upon ac- 
curacy of response and time. 

A preliminary study was carried out before the ex- 
periment was begun in order to devise a social concept 
test with a suitable range of item difficulty and to 
standardize test procedure. The final edition of the 
social concept test consisted of 21 items, the first 18 
items containing four cards each and the last three 
items containing six cards each. In both the four- and 
six-card items, three of the cards depicted instances of 
one social concept, while one card (in the case of the 
four-card items) and three cards (in the case of six- 
card items) represented negative instances. For ex- 
ample, in the case of the “rescue” item, the three con- 
cept cards included (a) a picture of a man rescuing a 
girl from the path of an advancing truck; (b) a scene
depicting a fireman rescuing a person from a burning 
building; and (c) a picture of a man throwing a life 
belt to a swimmer in distress. For the nonconcept card, 
there was a scene showing two children on sleds racing
toward a busy intersection. The S’s task was to separate 
the three cards that had the same idea from those that 
did not have that particular idea, and then to explain 
his sorting. 

It was decided before the experiment proper to 
score the social concept test as simply and as objec- 
tively as possible. As in the case of the formal concept 
tests, differential credits for each item were assigned 
on the basis of accuracy of response (i.e., the sorting of 
the three concept cards) and time. However, when the 
occasion demanded it, the schizophrenics were per- 
mitted to run over the time limit in order to minimize 
frustration and to afford material for qualitative analy- 
sis. An untimed social concept accuracy score was also 
utilized in the analysis of the data. 

Statistical treatment. One test of the hypothesis in- 
volves: (a) statistically equating the picture reasoning 
and verbal analogies scores of the two groups; (b) computing
the corrected difference between the control and
schizophrenic groups on the social concept test; (c) 
testing the significance of this difference by means of 
the extension of the standard error formula for one 
matching variable given by Peters and Van Voorhis 
(18, pp. 463-469). There are two major assumptions 
which should be met before this statistical treatment 
can be applied. The assumption of normally distributed 
variables was met by converting the raw scores of the 
two groups on the three concept tests into T scores,
i.e., normalized standard scores with a mean of 50 and 
a sigma of 10. The second assumption is linearity of 
regression among the three variables. The scattergrams 
revealed no evidence of curvilinearity but to check the 
point more carefully, statistical tests were performed 
to determine whether the hypothesis of curvilinearity 
of regression could be sustained. The method is based 
on an analysis of variance technique and is described 
by McNemar (17, pp. 255-258). The resulting F’s 
revealed that the hypothesis of curvilinearity of regression
cannot be sustained. The hypothesis is not thereby refuted, of course,
nor is rectilinearity proved. However, this evidence,
coupled with the fact that there is no a priori reason for 
expecting a curvilinear relationship between tests of 
cognitive ability, lends support to the hypothesis of 
rectilinearity of regression. 
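
The conversion to normalized T scores mentioned above can be sketched as follows. The source does not describe the exact normalizing procedure, so the midpoint percentile rank used here is an assumption (a common convention), and the function name is hypothetical.

```python
from statistics import NormalDist


def normalized_t_scores(raw_scores):
    """Convert raw scores to normalized T scores (mean 50, sigma 10).

    Each score's percentile rank is taken at the midpoint of its rank
    interval (tied scores share the same rank) and is then mapped through
    the inverse of the standard normal distribution.
    """
    n = len(raw_scores)
    standard_normal = NormalDist()  # mean 0, sigma 1
    t_scores = []
    for x in raw_scores:
        below = sum(1 for v in raw_scores if v < x)
        equal = sum(1 for v in raw_scores if v == x)
        # Midpoint percentile rank; always strictly between 0 and 1.
        percentile = (below + 0.5 * equal) / n
        z = standard_normal.inv_cdf(percentile)
        t_scores.append(50 + 10 * z)
    return t_scores
```

Because the transformation depends only on rank order, it forces the distribution of scores toward normality whatever the shape of the raw distribution, which is the property the statistical treatment requires.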


RESULTS 


The differences between the control and 
schizophrenic groups on the three concept 
tests are presented in Table 2. The first three 
critical ratios in this table were obtained 
through the application of the formula for the 
standard error of the difference between 
correlated means (17, p. 74). The fourth and 
fifth critical ratios, for the corrected social 
concept scores, were evaluated by means of 
the Peters and Van Voorhis formula men- 
tioned in the previous section. 

It will be seen that all the tests discriminate 
between the two groups in favor of the con- 
trols, that the social concept test shows the 
greatest disparity between the control and
schizophrenic groups, and that the difference 
between these groups on the social concept 
scale is highly significant even when differences 
on the formal concept tests are partialled out. 

These results are supported by the findings 
from the simpler technique of comparing the 
proportion of schizophrenics whose social 
concept scores were below the lowest social 
concept score of the controls with the pro- 
portion of schizophrenics whose formal con- 
cept scores fell below the poorest formal con- 
cept scores of the control group. Whereas 21 
of the 31 schizophrenics obtained social con- 
cept scores which were below the poorest social 
concept score of the controls, only nine schizo- 
phrenics secured picture reasoning or verbal 
analogies scores below the poorest control 
scores on the latter scales. The difference 
between these proportions yields a chi square 
for correlated proportions of 10.08, which is 
significant at the .01 level. 
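
A chi square for correlated proportions of this kind can be sketched as below, in the form usually credited to McNemar with a correction for continuity. The source gives only the marginal counts (21 and 9 of 31), not the discordant cells, so the breakdown used in the example is an assumption: all nine patients below the control range on the formal tests are taken to be among the 21 below it on the social concept test. The function name is hypothetical.

```python
def correlated_proportions_chi_square(b, c, continuity_correction=True):
    """Chi square (1 df) for the difference between two correlated proportions.

    b: cases below the control range on the first measure only
    c: cases below the control range on the second measure only
    """
    if b + c == 0:
        raise ValueError("no discordant cases; the statistic is undefined")
    diff = abs(b - c)
    if continuity_correction:
        diff = max(diff - 1, 0)
    return diff ** 2 / (b + c)


# Assumed breakdown (not reported in the source): b = 21 - 9 = 12, c = 0.
chi_square = correlated_proportions_chi_square(b=12, c=0)
print(round(chi_square, 2))  # 10.08
```

Under that assumed breakdown the corrected statistic agrees with the reported 10.08, but the agreement does not confirm that these were the actual cell counts.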

The social concept test was scored also for 
accuracy of sorting only, without taking ac- 
count of time. This untimed accuracy score 
enabled the schizophrenics, who, as a group, 
were slower than the controls, to score relatively
higher than they would with an
accuracy-time score. Nineteen of the 31 schizo-
phrenics obtained untimed social concept
scores below the poorest social concept score
of the control group. The difference between
this proportion and the proportion of schizo-
phrenics scoring below the poorest control on
the formal concept tests is significant at the
.01 level (chi square is equal to 9.09).


TABLE 2

DIFFERENCES AND SIGNIFICANCE OF THE DIFFERENCES
BETWEEN CONTROL AND SCHIZOPHRENIC GROUPS
ON THE VERBAL ANALOGIES, PICTURE REASONING,
AND SOCIAL CONCEPT TESTS

                                     MEAN           STANDARD ERROR OF
TEST                                 DIFFERENCE*    MEAN DIFFERENCE
Verbal analogies                      3.645          1.329
Picture reasoning                     8.387          1.488
Social concepts (unadjusted)**       13.806          1.636
Social concepts-I (adjusted)**       11.945          1.538
Social concepts-II (adjusted)**      11.613          1.806

* The differences are all in favor of the controls, are ex-
pressed in T-score units, and are all significant beyond the .01
level of confidence.

** The unadjusted social concept scores are those experi-
mentally obtained. The adjusted social concept scores have been
statistically corrected so as to equate the two groups (a) on both
verbal analogies and picture reasoning (social concepts-I), or
(b) only on picture reasoning, which was the more discriminating
of the two formal concept tests (social concepts-II).
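
The critical ratios discussed for Table 2 can be recomputed from the mean differences and standard errors, since a critical ratio is simply the difference divided by its standard error. The sketch below assumes that the figures belong to the rows in the order in which they are printed; the ratios it yields are recomputed values, not figures taken from the source, and the names are hypothetical.

```python
# (mean difference, standard error of mean difference), in T-score units.
# The row assignment assumes the printed order of the figures in Table 2.
TABLE_2 = {
    "Verbal analogies": (3.645, 1.329),
    "Picture reasoning": (8.387, 1.488),
    "Social concepts (unadjusted)": (13.806, 1.636),
    "Social concepts-I (adjusted)": (11.945, 1.538),
    "Social concepts-II (adjusted)": (11.613, 1.806),
}


def critical_ratio(mean_difference, standard_error):
    """Critical ratio: a difference expressed in units of its standard error."""
    return mean_difference / standard_error


critical_ratios = {
    test: round(critical_ratio(diff, se), 2)
    for test, (diff, se) in TABLE_2.items()
}

for test, ratio in critical_ratios.items():
    print(f"{test:32s} {ratio:6.2f}")
```

On these figures the social concept test gives the largest ratio and verbal analogies the smallest, which is the ordering the text describes.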

We have also been able to match directly 
12 pairs of controls and schizophrenics whose 
formal concept scores (averaged verbal analogies
and picture reasoning T scores) were
exactly equal. This was done in order to make 
a direct comparison of their respective untimed 
social concept T scores without necessitating a 
prior statistical correction for inequality in 
formal conceptualization. The average differ- 
ence between the two groups of matched 
Ss is 9.83 (in favor of the controls), with a 
standard error of mean difference equal to 
1.014, yielding a t ratio for small correlated
samples (17, p. 226) of 9.69. This t is significant
at the .01 level for 11 df, and substantiates by 
a direct equation on formal concepts what was 
found by the statistical matching reported 
above. 

Finally, 16 pairs of controls and schizophren- 
ics were matched directly on the timed picture 
reasoning test alone, since this latter scale 
proved to be more discriminating between the 
two groups than the verbal analogies test. 
The mean difference between controls and 
schizophrenics on the untimed social concept 
test was 6.78 (in favor of the controls), with a 
standard error of mean difference equal to 
2.378. The obtained t of 2.85 is significant at
the .02 level, or at the .01 level for a one-tailed test. The use of a
one-tailed significance test is defensible since 
the hypothesis presupposed a difference in a 
particular direction, namely, in the direction of 
schizophrenic decrement. 
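
The t ratios for the two sets of directly matched pairs follow from the reported mean differences and standard errors. The sketch below shows both the calculation from raw pair differences and the check against the summary values quoted in the text; the individual pair differences were not published, so only the summary check uses figures from the source, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(differences):
    """t ratio and degrees of freedom for correlated (matched-pair) samples."""
    n = len(differences)
    # Standard error of the mean difference, from the sample SD (n - 1 divisor).
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


def t_from_summary(mean_difference, standard_error):
    """t ratio when only the mean difference and its standard error are known."""
    return mean_difference / standard_error


# Summary values reported in the text.
t_12_pairs = t_from_summary(9.83, 1.014)   # 12 pairs, 11 df
t_16_pairs = t_from_summary(6.78, 2.378)   # 16 pairs, 15 df
print(round(t_12_pairs, 2), round(t_16_pairs, 2))  # 9.69 2.85
```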

In each of the above t tests, the hypothesis
of homogeneity of variance was not refuted by 
the significance level corresponding to the F 
ratio obtained by dividing the variances of 
the schizophrenics’ and controls’ untimed social 
concept scores. 

It may be argued that the social concept 
test may make heavier demands upon ab- 
stractive capacity than the formal concept 
tests, i.e., may require greater generalization 
ability, more planning skill, keener isolation of 
parts, or may make greater demands upon 
symbolic processes (13). If this is so, then the 
schizophrenic decrement on the social concept 
test could be predicted solely from the hypothesis
of schizophrenic impairment of abstract
attitude. However, if this is true, then it 
follows that (a) the controls would find the 
social concept test more difficult than the 
formal concept tests, and (b) the schizophrenic
group would show greater decrement on those 
items of the social concept test which are more 
difficult for the controls than on the easier 
items of this test. As a check upon the first 
deduction, we have compared the number of 
failures per item (total number of failures 
divided by the number of items on the test) 
of the verbal analogies, picture reasoning, and 
social concept tests. The numbers of failures
per item were 13.5, 7.2, and 7.8, respectively.
There is therefore no evidence to support the 
hypothesis that the schizophrenic decrement 
on the social concept test is due to the greater 
intrinsic difficulty of this test. As a check upon 
the second deduction, the items of the social 
concept test were ranked in order of difficulty 
for the control group (percentage passing) and 
in order of intergroup discriminative power. 
The obtained rho was actually negative 
(—.30), implying that there was no tendency 
for the more difficult items to prove more 
potent as differentiators between the two 
groups than the easier ones. 
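
The rank-order check on the second deduction can be sketched with the rank-difference formula for rho. The item-level ranks were not published, so the ranks in the example are purely hypothetical; the formula as written also assumes no tied ranks, which may not hold for the actual 21 items.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-difference correlation for two untied rankings of the same items."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rankings must cover the same items")
    n = len(ranks_a)
    if n < 2:
        raise ValueError("at least two items are needed")
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks for five items (1 = most difficult / most discriminating).
difficulty_rank = [1, 2, 3, 4, 5]
discrimination_rank = [4, 5, 2, 3, 1]
rho = spearman_rho(difficulty_rank, discrimination_rank)
```

A negative rho, as in the study, means that the items harder for the controls tended, if anything, to separate the two groups less sharply than the easier ones.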


DISCUSSION 


The results lend support to the hypothesis 
in that they demonstrate a pronounced decre- 
ment in schizophrenic performance on social 
concepts relative to formal concepts. These 
results were predicted from Cameron’s theory, 
but the logic of the prediction bears closer 
scrutiny. An attempt will be made to phrase 
the argument with greater explicitness, and, 
it is hoped, with a consequent increase in 
rigor. The paradigm for the argument is 
borrowed from Cohen and Nagel (10). 

More schizophrenics than normals exhibit 
decrements in conceptualization. These are 
observations or statistical inferences from 
observations. 


Rival Hypotheses 


1. If more schizophrenics suffer from “social
disarticulation,” resulting in an impairment of 
organized, socially acceptable thinking, then 
schizophrenics would tend to exhibit significant 
decrements in conceptualization. (General 
rule—Cameron’s hypothesis.) 
2. If more schizophrenics than controls
suffer from a decrement in the “abstract 
attitude,” then schizophrenics would tend to 
exhibit significant decrements in conceptuali- 
zation. (General rule—Goldstein’s hypothesis.) 


Deductive Elaboration 


1. From Cameron’s hypothesis: 

a. More schizophrenics than controls 
suffer from social disarticulation. 

b. Those who suffer from social dis- 
articulation would have greater difficulty in 
the conceptualization of socially toned ma- 
terial than more neutrally toned material. 

c. Therefore schizophrenics would evi- 
dence greater decrements in the conceptuali- 
zation of socially toned material than more 
neutrally toned material. 

2. From Goldstein’s hypothesis: 

a. More schizophrenics than controls 
suffer decrements in abstract attitude. 

b. Those who suffer from abstract atti- 
tude decrement may or may not show selective 
decrement in social concepts. 

The expression “may or may not” refers to 
the fact that it is impossible to deduce proposi- 
tions about social concept decrement from 
Goldstein’s hypothesis with the same degree 
of probability that such an inference can be 
made from Cameron’s hypothesis. Indeed the 
presumption would be that since the abstract 
attitude represents a capacity level of the 
total personality, there would be little selective 
impairment within sizable groups, regardless 
of the type of conceptualization task used, 
provided they are comparable in difficulty. 

It would appear that this decrement can be 
assimilated best into a theoretical framework 
which also stresses the importance of ex- 
perience and learning as determinants of 
cognitive functioning. Irrespective of one’s 
systematic orientation in the field of learning, 
whether one is a cognitive theorist or a re- 
inforcement theorist, it can be agreed that the 
less one comes into contact with a given type 
of situation, the less one will know about it 
and the less one will be able to respond ade- 
quately to it. As Hunt and Cofer put it, “as 
must all thought processes like classifying, 
inducing relations, etc., we should guess that 
the ‘abstract abilities’ assessed by these tests 
involve habitual skills which may either be 
strengthened or weakened by an individual’s 
social experience” (15, p. 1019). Bleuler, 
despite his recognition of constitutional determinants,
emphasizes that “endowment
with the capacity to form many sharply de- 
fined concepts is an empty potentiality in the 
absence of rich experience to supply the ma- 
terial for the concepts and their delineation” 
(2, p. 430). Just why the schizophrenic is 
socially estranged cannot be answered. The 
question of asocialization is part of the broader 
problem of socialization, a convergent area for 
the fields of learning and personality, neither 
of which is advanced as yet to the stage of 
individual prediction or even delineation of 
crucial factors. However, if we were to assume 
such a process of social disarticulation, despite 
admitted ignorance of its exact determinants, 
then the obtained selective decrement might 
be considered as a result of schizophrenic 
withdrawal, possibly acting in conjunction 
with, or as a reaction to, a deficit in abstract 
attitude. 


SUMMARY 


A hypothesis was derived from Cameron’s 
view of schizophrenic thinking as a product of 
the social disarticulation of this group, as 
contrasted with Goldstein’s interpretation of 
the defect in schizophrenic thought as the 
result of an impairment of the abstract atti- 
tude. The hypothesis was that schizophrenics 
would exhibit a greater decrement relative to 
normals on a test of social concepts than on 
tests of formal concepts. 

Two formal concept tests (one verbal and 
one performance) and a pictorial social con- 
cept test were administered to 31 schizophren- 
ics and 31 normal controls matched with 
respect to age, education, sex, and vocabulary 
score. The populations were then equated 
statistically and through direct matchings on 
the formal concept scores. 

Significant differences in favor of the con- 
trols were obtained between the two groups 
on both types of test. However, schizophrenic 
decrement on the social concept test proved 
significantly greater than decrement on the 
formal concept tests. This differential decre- 
ment obtained whether timed or untimed 
scores of the social concept test were used as 
dependent variables. 

It is believed that as an explanatory hypoth- 
esis, the concept of an impairment of ab- 
stract attitude is, by itself, insufficient to 
account for the selective schizophrenic impairment
on the social concept test. The latter
decrement has been interpreted as lending 
presumptive support to a theoretical position 
which also stresses the importance of social 
withdrawal as a determinant of cognitive func- 
tioning in schizophrenia. 


PSYCHOLOGICAL PROGNOSIS OF OUTCOME IN THE
MENTAL DISORDERS¹


When mental disorders are regarded as continuing
processes running a regular course with definite
characteristics and not as haphazard vicissitudes of
life, it becomes possible to study the initial, 
middle, and ultimate stages of these processes 
and hopefully to obtain indicators in each stage 
of the probable subsequent course of the disor- 
der. This is the essence of the prognostic prob- 
lem. The personality of the patient, his genetic 
and environmental background, his family relationships,
vocational and social adjustment,
in fact all his assets and liabilities at any given 
period, may become important indicators of 
the probable course of his illness from then on. 
The wise clinician has always been aware 
that the characteristics of the patient at the 
time he comes for treatment have an important 
relationship to his chances for eventual im- 
provement, provided the stage of the illness 
can be determined. One able diagnostician, 
Kraepelin, has even proposed for one illness, 
dementia praecox, that diagnosis be based on 
the probable outcome of the disease. Present- 
day clinicians still follow this pattern to some 
extent; they examine disease processes with an 
eye to their eventual outcome and diagnose ac- 
cordingly. But the observations on which these 
prognoses are made are usually impressionistic. 
Whether the psychologist can do any better 
with tests than the clinician with his intuition 
is perhaps debatable, but there is such a
definite demand for improving prognostic 
criteria that tests are well worth trying. The 
demand for improvement arises in the wake of 
the general finding that modern specific ther- 
apies of both the psychotherapeutic and soma- 
totherapeutic variety seem to have an advan- 
tage over nonspecific therapies, such as “total 


1 Read at the symposium on “Interrelations of Bio- 
chemistry, Psychiatry, and Psychology in Prognosis,” 
American Psychological Association, September, 1951, 
Chicago, Illinois. The preparation of this article was 
facilitated by a grant from the National Institute of 
Mental Health, Public Health Service, Department of
Health, Education, and Welfare. Dr. Windle is now a 
Research Scientist at HumRRO. 


push,” only for the period immediately follow- 
ing the therapy. In long-range follow-up 
studies, no conclusive evidence is available for 
any advantage in favor of specific therapies 
(84). There are two possible explanations for 
the failure of the specific therapies to stand up 
under long-range scrutiny: (a) the specific 
therapies help only those patients who would 
improve anyhow, and (b) each of the specific
therapies is effective only for a specific type of 
patient. Perhaps, if prognostic tests were avail- 
able to predict the different outcomes to be 
expected for a given patient under each of the 
specific therapies, better results than those af- 
forded by nonspecific therapies would follow. 
In either case, prognostic information seems 
highly desirable. Consequently, psychological 
tests have been employed with increasing fre- 
quency in order to provide more standard in- 
dications of probable outcome, and this has 
made possible confirmatory studies and sta- 
tistical evaluation of the reliability and valid- 
ity of the measures. 

Despite the great increase in interest in prog- 
nosis today, most studies have preferred to 
follow the pattern laid down by the earlier diag- 
nostic studies rather than blaze new trails 
which would lead more directly to the goals of 
prognosis. There are some problems common 
to both diagnostic and prognostic studies, such 
as, (a) procedures for establishing validity of 
instruments; (b) classification of patients into
homogeneous categories such as age, sex, etc.; 
(c) proper statistical treatment of the data. 
Among the problems peculiar to prognosis are: 
(a) proper design of prognostic study; (b) specification
of therapeutic agent or type of treatment
to which patient is to be exposed; (c)
duration of illness at time when prognosis is 
made; (d) stage of illness at which prognosis is 
made; (e) duration of follow-up; and (f) criteria 
for evaluating outcome. We shall deal with 
each of these difficulties in turn. 

The psychological tests used to evaluate 
functioning of patients cannot now be validated 
directly but must depend for their ultimate
validation on interviews and direct observations
of patient behavior. In contrast with
physiology, which also evaluates functioning, 
psychology has no anatomy to which the ob- 
served behavior can be related. The so-called 
structural elements portrayed in Freud’s 
anatomy of the mind, Lewin’s “differentiated 
regions of the life space,” Rank’s will, McDougall’s
sentiments, Stern’s self, Spearman’s
and Thurstone’s factors, and the personality 
structure conjured by Rorschach and projec- 
tive technique workers, are little more than an 
assemblage of abstractions derived from de- 
scriptions of behavior, rather than independent 
constructs amenable to direct measurement. 
Perhaps the neurophysiologists will some day 
perfect models of brain function which may 
serve as a physiological substratum for percep- 
tion, thinking, and action, but such models are 
still in the future. Until that time, the only 
possible way to validate psychological tests is 
through direct observation of behavior and 
interviews. 

Prognostic studies, unless well designed and 
appropriately implemented, are of little value. 
Too often the prognostic aspects of a study are 
introduced as an afterthought when the major 
aim of the study has failed to materialize. In 
order for a prognostic study to be scientifically 
acceptable it must be planned for, and must 
satisfy certain elementary criteria in its pres- 
entation of the data. Of primary importance, 
of course, is the specification of the particular 
pathological condition under investigation, 
including both the specific disease and the 
condition of the patients with regard to age, 
sex, attitude toward illness, etc. Also important 
are the form of therapy and the time relations 
in the prognosis. It is well known that the 
stage of development of the illness is relevant 
to prognosis. The outcome immediately after 
therapy is often at variance with the outcome 
after several years, and for this reason the 
period of follow-up must be specified. One 
might think that these elementary require- 
ments of scientific reporting are so self-evident 
that they hardly need mentioning. Unfor- 
tunately, many of the studies in the literature 
fail to provide these crucial bits of information. 
The use of suitable statistics for evaluating 
outcome is, of course, essential in prognostic 
studies. Nevertheless, some studies in this 
field suffer from the inadequate use of statistics,
many being content with an impressionistic
comparison of central tendencies.
Furthermore, most of the prognostic studies 
today are based on hindsight rather than fore- 
sight. It is to be hoped that as more experience 
is gained, a few hardy souls will attempt truly 
predictive studies. 

Another difficulty that hampers prognostic 
studies is the absence of uniformly acceptable 
criteria of improvement against which prog- 
nostic tests could be validated. Such terms as 
“cured” and “recovered,” appropriate as they 
may be in acute physical diseases like appendi- 
citis, are not at all applicable to chronic ill- 
nesses like schizophrenia. The goal of therapy, 
which is fairly well defined in the acute physical 
diseases—to restore the patient to his pre- 
morbid status—must be readjusted to a lower 
level of aspiration in the case of chronic ill- 
nesses (82). As a realistic basis for the evalua- 
tion of outcome of therapy one must still 
resort to such crude measures as being in or 
out of the hospital. 

In view of the above difficulties, the only 
method of ascertaining whether any relation- 
ships exist between test performance and sub- 
sequent outcome was to search for possible 
consistencies in the various studies despite 
their shortcomings. Previous reviews of the 
literature on this topic (16, 18, 64, 66) have 
been of limited scope and have not attempted 
a thoroughgoing resolution of the conflicting 
claims made for the various tests. Conse- 
quently, a more complete review (78) was 
undertaken by one of the present authors to 
determine the validity of the prognostic claims 
for each of the psychological tests used. 

It was found that most of the reports of the 
prognostic efficacy of psychological tests were 
concerned with three techniques: the Ror- 
schach, the Minnesota Multiphasic Personality 
Inventory (MMPI), and a variety of intelli- 
gence tests. Each of these techniques had been 
claimed to be effective in predicting outcome, 
but in each case several investigators failed to
confirm the claims. In addition, a wide variety 
of specific indices had been claimed to be pre- 
dictive in a given test, and there was con- 
siderable diversity in the prognostic value or 
direction of any particular sign or index. This 
discordance can best be illustrated by pre- 
senting summaries of the findings employing 
these specific psychological tests. 

The review of the literature disclosed 22 
studies in which the Rorschach technique was
investigated as a prognostic agent in the psychoses.²
All but seven (18, 29, 60, 61, 63, 66,
76) of these studies reported the Rorschach to 
have predictive efficacy. Furthermore, most of 
the studies claimed that in general the more 
capable patients or those who were most nearly 
normal were more likely to have favorable 
outcomes. In spite of this agreement in the 
over-all interpretation of the Rorschach proto- 
cols, the empirical signs proposed for predic- 
tion exhibited little consistency. 

For example, one outstanding exponent of 
the prognostic use of the Rorschach, who has 
contributed eight papers to this field (38, 50, 
54, 55, 56, 57, 58, 59), has proposed in each of 
seven different experiments a different set of 
signs or indices, often varying considerably 
from the signs reported in previous investiga- 
tions. It appears that not one of the indices 
which was found to have prognostic power
retrospectively maintained this power in a
new sample. The same inconsistency that 
occurs in the empirical findings is also present 
at the level of interpretation. Thus, running 
parallel with the seven different groups of 
prognostic indices are three slightly different 
interpretations of the personality patterns 
which are claimed to differentiate outcome 
groups (55, 57, 58). One can sympathize with 
the strivings to improve the test results, but 
one must also realize that the continued change 
of indicators may reflect the inappropriateness 
of the selected signs. Furthermore, the multi- 
plicity of possible indices afforded by the 
Rorschach guarantees that some pseudosignifi- 
cant differences will always present them- 
selves. 
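
The point about multiplicity can be made concrete with a small sketch. It assumes, purely for illustration, that k candidate indices are tested independently at level alpha and that none of them has any real prognostic value; the chance that at least one nevertheless appears significant is then 1 - (1 - alpha)^k. The function name and the example values of k are hypothetical, and Rorschach indices are in practice intercorrelated, so the figures are only a rough guide.

```python
def chance_of_spurious_sign(k: int, alpha: float = 0.05) -> float:
    """Probability that at least one of k independent null tests reaches level alpha."""
    if k < 0 or not 0.0 <= alpha <= 1.0:
        raise ValueError("k must be non-negative and alpha must lie in [0, 1]")
    # Complement of the event "none of the k tests is significant".
    return 1.0 - (1.0 - alpha) ** k


if __name__ == "__main__":
    # Hypothetical numbers of indices examined in a single study.
    for k in (1, 10, 20, 50):
        print(k, round(chance_of_spurious_sign(k), 3))
```

Under these assumptions the probability grows quickly with the number of indices examined, which is why a sign found retrospectively needs confirmation in a new sample.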

Another example of the difficulties of the 
Rorschach technique is provided in a follow-up 
study by Sloan (73) of mental defectives dis- 
charged on wage placement. Seven criteria 
which had been suggested by another well- 
known Rorschach expert as prognostic for 
good adjustment were computed for contrasted 
outcome groups. Although the Rorschach 
criteria for good adjustment adequately differ- 
entiated the outcome groups, the predicted 
outcome was the very opposite of that which 
actually occurred. Those mental defectives 
who were successful in staying out on wage 
placement had a reliably smaller number of 
agreements with the criteria for good adjustment
than did those who were returned as
failures.

2 These 22 studies included 18, 20, 22, 29, 30, 38, 40,
47, 49, 50, 54, 55, 56, 57, 59, 60, 61, 63, 66, 67, 75, 76.

Additional difficulties in accepting the claims 
for the prognostic value of the Rorschach arise 
from the inadequate evaluation of the results 
of experiments in this field. Only six of the 
studies making prognostic claims with psy- 
chotics (40, 47, 54, 56, 59, 67) have utilized 
statistical methods to test significance, and 
two of these (47, 54) omitted the use of Yates’s 
correction in the chi-square test with small 
samples. The best clinical verdict one can give 
regarding the prognostic claims of the Ror- 
schach technique is: not proven. 
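
To show why omitting Yates’s correction matters with small samples, the following sketch computes the chi-square statistic for a fourfold table with and without the correction. The function name and the cell counts are hypothetical and are not taken from any of the studies reviewed; the p value uses the fact that, for one degree of freedom, the upper-tail probability equals erfc(sqrt(x / 2)).

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int, yates: bool = True):
    """Chi-square test of association for the fourfold table [[a, b], [c, d]].

    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be positive")
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink |ad - bc| by n/2, never below zero.
        diff = max(0.0, diff - n / 2.0)
    stat = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical counts: sign present/absent by improved/unimproved.
    table = (8, 2, 3, 7)
    print("uncorrected:", chi_square_2x2(*table, yates=False))
    print("corrected:  ", chi_square_2x2(*table, yates=True))
```

With these made-up counts the uncorrected statistic passes the .05 level while the corrected one does not, which is the kind of discrepancy that can arise when the correction is left out for small samples.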

It might be expected that studies of the 
prognostic value of the MMPI would show 
more consistency than those of the Rorschach. 
The scoring of this test is entirely objective, 
there are fewer standard indices that might 
appear to have prognostic value so that pure 
chance results play a smaller role, and less 
retrospective manipulation of the indices in 
the form of fractions, ratios, and weighted 
scores is possible. Consequently, it is to be 
expected that most of the MMPI studies would 
be more likely to establish convincingly the 
prognostic value of the test. It is, therefore, 
disappointing to discover the large amount of 
disagreement in the MMPI findings. In about 
one third of the studies high scores on specific 
scales were thought to signify good outcome 
for psychoses, in another third high scores indi- 
cated poor outcome, and in the remaining 
third no prognostic value was found. Thus, 
high scores on the schizophrenia scale, the one 
most frequently cited, were thought by Harris, 
Bowman, and Simon (26), Hales and Simon 
(21), and Carp (8) to be favorable for outcome 
in the psychoses, while Harris (24), Pearson 
(51), and Feldman (16) felt high schizophrenia 
scores were unfavorable. A similar situation 
existed with the other MMPI scales, high 
scores sometimes being claimed indicative of 
good outcome, but as often found to be corre- 
lated with poor outcome or of no prognostic 
value. 

It is clear that such evidence is not very 
helpful in providing a basis for prediction. It 
is quite likely that some of the studies provide 
more valuable evidence than others, but unless 
statistical estimates of the validity of the find- 
ings are provided, it is difficult to make a 
distinction between the good and the poor 
studies. Perhaps of even greater importance
is the paucity of confirmatory studies with
new samples. Such confirmation has been 
attempted in four instances with the MMPI, 
but only two of these attempts proved success- 
ful (16, 52). One of the others failed to specify 
the criteria of outcome (26), while the other
claimed validation on the basis of insignificant 
data (51). Again, the most accurate evaluation 
of the prognostic efficacy of the MMPI must 
be: not proven. 

Studies of the use of ability tests for prog- 
nosis also fail to yield a consistent basis for 
prediction of outcome. Of 23 reports of the 
prognostic value of various psychomotor tests 
with psychotics, 10 indicated that favorable 
outcome correlated with efficient performance 
before therapy (6, 9, 20, 33, 38, 42, 49, 67, 77, 
83). On the other hand, five reports concluded 
that inefficient performance was prognostically 
favorable (35, 43, 44, 48, 84), three reported 
either a curvilinear relationship or contrary 
trends under different conditions (36, 68, 79), 
and five found no relationship between intelli- 
gence tests and outcome (19, 26, 53, 60, 74). 
Such a summary of the results of the various 
ability tests may do injustice to the particular 
types of tests employed or special abilities 
studied. There is little evidence, however, that 
any particular test is consistently more effec- 
tive for prognosis than another, or that the 
nature of the test can account for the divergent 
conclusions found in the literature. Again, the 
verdict is: not proven. 

We have found, then, that with each of the 
three techniques, the Rorschach, the MMPI, 
and tests of ability, there is a large amount of 
disagreement concerning both the signs to be 
regarded as prognostic and the direction of the 
outcome to be predicted from any particular 
sign. 

To some extent specific findings seem to be 
attributable either to the diagnosis of the 
patients or apparently to the type of treatment.
For example, Malamud and others (43, 44) 
have reported that involutional psychotics of 
estimated inferior intelligence have better 
prognoses than do those of above average 
intelligence. Studies by Schnack, Shakow, and 
Lively (68) and by Zubin and Thompson (83) 
have indicated that different prognostic signs 
may apply to patients given insulin than to 
patients given metrazol therapy. Numerous 
investigators (59, 68, 85) have also pointed out
that the duration of the follow-up period may
play a role in determining the significance
of the prognostic signs.

It is difficult to try to account for the di- 
vergent conclusions of the prognostic studies 
on the basis of any general characteristics, be- 
cause relatively few investigators specified the 
important conditions which characterized their 
experiments. Thus, of 28 prognostic studies 
with the Rorschach, 19 (all but 20, 22, 38, 40, 
47, 56, 59, 66, 76) did not specify the duration 
of illness of the patients and seven (30, 34, 54, 
55, 63, 67, 75) failed to state the time of follow-up.³
Of 10 MMPI prognostic studies (8,
15, 16, 21, 24, 26, 50, 51, 52, 69), five reported 
that the prognoses applied to the broad diag- 
nostic category of “psychiatric patients” (15, 
16, 24, 26, 52), only about one third specified 
the duration of illness of their patients (8, 21, 
26), and about one third did not indicate the 
duration of the follow-up period (8, 24, 69). 

None of these three factors, diagnosis, type 
of therapy, or duration of follow-up, seems to 
be able to account for all of the discrepancies 
in the literature. Apparently some other factor 
must be found to account for these contrary 
trends. The first question that needs to be 
answered is: Do the more efficiently function- 
ing or the less efficiently functioning patients 
have the better outcome or is there no relation- 
ship at all? 

In several instances in which inefficient pre- 
treatment test performance was correlated 
with good outcome, it was observed that the 
majority of the patients under study were 
chronically ill (26, 35, 43, 44, 48, 84). It was 
thought that, since duration of illness is well 
known to be an important factor in prognosis, 
it is highly probable that radically different 
prognostic signs would apply to patients differ- 
ing in duration of illness. To test this hypothe- 
sis, data were gathered for both chronically ill 
psychotics (long duration of illness) and 
acutely ill psychotics (short duration of illness) 
on the Complex Reaction Time Test (79), a 
test which in essence provides a measure of the 
patient’s ability to react quickly and appropri- 
ately to complex tasks presented continuously. 

Comparisons of the relationships between
initial CRT scores and final outcome status
revealed divergent trends; initially, the improved
among the acutely ill patients had the
higher scores while the improved among the
chronically ill had the lower scores. The divergence
between these relationships was found
to be significant at the .05 level when tested
by analysis of variance. Even more striking,
individual analyses revealed that a cut-off
score of 30 was significantly differential for
both groups of patients, but in opposite directions.

3 These 28 studies include 7, 11, 18, 20, 22, 29, 30,
34, 38, 40, 47, 49, 50, 54, 55, 56, 57, 59, 60, 61, 63, 64,
66, 67, 70, 71, 75, 76.

JOSEPH ZUBIN AND CHARLES WINDLE

[Figure not reproduced. Legend: Improved, Unimproved. Panels: ACUTELY ILL, CHRONICALLY ILL. Marked score levels: LOW, CRITICAL.]

FIG. 1. THE RELATION BETWEEN SCORE ON THE CRT TEST AND OUTCOME FOR ACUTE (EARLY) AND CHRONIC
PATIENTS

An individual analysis of the scores revealed 
that in the chronically ill, none of those who 
later left the hospital had scores of 30 or more, 
whereas half of those who were still in the 
hospital at the time of the follow-up had 
scores above this critical point. In the acutely 
ill, this same critical score was found signifi- 
cantly differential in the reverse order. In this 
case, all of the improved fell above the critical 
score of 30, while of the unimproved about one 
third fell below that point. 
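
The individual analysis around a critical score can be sketched as follows. The scores are invented for illustration and only mimic the pattern described for the chronically ill (no improved patient at 30 or more, half of the unimproved at 30 or more); the function names are hypothetical, and Fisher’s exact test is used here as one reasonable choice for so small a table, not as the test the original analysis employed.

```python
from math import comb


def split_by_cutoff(scores, improved, cutoff=30):
    """Return fourfold counts (a, b, c, d):
    a = improved at or above cutoff, b = improved below cutoff,
    c = unimproved at or above cutoff, d = unimproved below cutoff."""
    a = b = c = d = 0
    for score, is_improved in zip(scores, improved):
        high = score >= cutoff
        if is_improved and high:
            a += 1
        elif is_improved:
            b += 1
        elif high:
            c += 1
        else:
            d += 1
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p: total probability of all tables with the
    same margins that are no more probable than the observed table."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the upper-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Hypothetical chronic group: six improved, eight unimproved patients.
CHRONIC_SCORES = [12, 18, 22, 25, 27, 29, 15, 20, 24, 28, 31, 34, 36, 40]
CHRONIC_IMPROVED = [True] * 6 + [False] * 8

if __name__ == "__main__":
    counts = split_by_cutoff(CHRONIC_SCORES, CHRONIC_IMPROVED)
    print(counts, fisher_exact_two_sided(*counts))
```

Running the same two functions on an acutely ill sample would show whether the cut-off separates the outcome groups in the reverse direction.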

This study indicates that a possible explana- 
tion of the contradictions in the literature deal- 
ing with the use of psychological tests for prog- 
nosis may be the neglect of the factor of 
chronicity. Because relatively few studies have 
made a point of specifying this factor, it is 
hard to determine whether or not this variable 
can satisfactorily account for all the observed 
differences in results. But it can be seen to be 
important in several cases, especially for those 
studies using ability tests or the MMPI. 


Apparently most of the studies employed 
either groups of acutely ill psychotics or groups
of psychotics of various degrees of chronicity.
Since patients of longer duration typically
have poorer prognoses and perform less effi- 
ciently than acutely ill patients, it is to be ex- 
pected that a positive relationship between pre- 
therapy performance and outcome would be 
found when the group under study consists of 
a mixture of chronically and acutely ill pa- 
tients. Many of the studies reporting an inverse 
relationship between performance level and 
outcome may have employed chronically ill 
patients only. The basis for the negative correlation
between test score and outcome in such
patients will be discussed later. 
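
The argument about mixed groups can be illustrated numerically. In the sketch below all scores and outcomes are invented: the acute patients score higher and mostly improve, while among the chronic patients the improved are the lower scorers. The helper name is hypothetical; with a 0/1 outcome the Pearson coefficient it returns is the point-biserial correlation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; with a 0/1 outcome this is the point-biserial r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical pretherapy scores and outcomes (1 = improved, 0 = unimproved).
ACUTE_SCORES = [32, 35, 38, 40, 42, 45]
ACUTE_OUTCOME = [0, 1, 1, 1, 1, 1]
CHRONIC_SCORES = [10, 12, 14, 20, 24, 28]
CHRONIC_OUTCOME = [1, 1, 0, 0, 0, 0]

r_acute = pearson(ACUTE_SCORES, ACUTE_OUTCOME)
r_chronic = pearson(CHRONIC_SCORES, CHRONIC_OUTCOME)
# Pooling the two groups, as a study that ignores chronicity would do.
r_pooled = pearson(ACUTE_SCORES + CHRONIC_SCORES,
                   ACUTE_OUTCOME + CHRONIC_OUTCOME)

print(f"acute r = {r_acute:+.2f}, chronic r = {r_chronic:+.2f}, "
      f"pooled r = {r_pooled:+.2f}")
```

In this made-up sample the pooled correlation is positive even though the correlation within the chronic group is negative, so a study of a mixed group and a study of chronic patients alone could report opposite prognostic directions for the same test.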

The findings that different prognostic signs 
apply to metrazol and insulin therapy (68, 83) 
may also be accounted for on this basis, since 
the more severely ill patients were more likely 
to be treated with metrazol and the less 
severely ill with insulin. It must be remem- 
bered, however, that all of the conflicting con- 
clusions found in the literature cannot be ex- 
plained on this basis alone, especially since our 
studies have dealt only with psychotics. Fur- 
thermore, it is not to be expected that all con- 
flicts would be resolved with the control of 
merely this one factor, since other factors are 
undoubtedly of importance even if we have 
not as yet identified them and their contributions
to prognosis.

There is little difficulty in accounting for
the prognostic findings with acutely ill psy- 
chotics since it is to be expected that patients 
capable of efficient performance would tend to 
have the better outcome. For the chronically 
ill, however, it is contrary to common sense 
to predict that psychotics in whom the devia- 
tion from normal efficiency is relatively great 
should have better outcomes than do psy- 
chotics less altered in performance ability by 
the disease process. Consequently, it was de- 
sired to study more thoroughly the 48 chroni- 
cally ill Columbia-Greystone patients (45) to 
verify the prognostic value of inefficient per- 
formance, and, if possible, to specify more 
clearly the nature and extent of this ineffi- 
ciency (85). 

Of the 44 pretreatment measures available, 
it was found that five differentiated between 
criterion outcome groups of 38 patients who 
had been cooperative enough to yield repre- 
sentative data on more than a third of the 
tests. In each case inefficient performance 
presaged favorable outcome. In general, it 
appeared that on conceptual tests those pa- 
tients who subsequently improved functioned 
less well than those who did not improve, 
but on perceptual tests the former functioned 
somewhat better. 

Type of therapy was unrelated to either out- 
come or prognostic indices, but, interestingly, 
was related to the psychiatric ratings of prognosis
made at the time of testing. Patients who
were expected to improve were found significantly
more often than the others to have had
brain operations, either immediately or during 
the follow-up period. The psychiatric prognos- 
tic ratings were found to be significantly related 
to outcome when all the patients were con- 
sidered, regardless of testability, but were not 
found to be related in the cooperative cases 
alone. 

This brings us to a discussion of the 10 un- 
cooperative patients. All but one of these pa- 
tients had an unfavorable outcome and a 
diagnosis of hebephrenic schizophrenia. (This 
one relation was, incidentally, the only connec- 
tion found between diagnosis and outcome.) 
Apparently, it was on the basis of these 
deteriorated cases that the psychiatric prog- 
noses achieved their relation with outcome in 
the entire group. Furthermore, these patients 
are apparently even less efficient than are 
those who later improve. We are reminded of 
Langfeldt’s (36) conclusion that both clever 
patients and intellectually debilitated patients 
have catastrophic outcomes. Apparently the 
same situation exists in our study. Chronic 
patients who performed relatively efficiently 
and those from whom no representative per- 
formance could be elicited had remained con- 
tinuously in, or had been permanently returned 
to, the hospital during the more than three 
years since testing. The hypothesized relationship
between levels of performance and the
stages of chronicity can best be shown graphically.

Most patients may be expected to function 
somewhat below the normal in psychological 
tests, this decrement increasing with the 
chronicity of illness. In general, acutely ill 
patients function better than the more chron- 
ically ill and have better prognoses. Within 
the chronic group the better functioning seem 
to have rather poor prognoses, those of inter- 
mediate functioning have relatively good prog- 
noses, and those who seem deteriorated have 
very poor prognoses. 

That even such global techniques as the 
Rorschach might prove effective as prognostic 
agents when the protocols they afford are 
investigated systematically and objectively 
without the restrictions imposed by orthodox 
scoring and interpretation is demonstrated by 
McCall’s study (40) of the preoperative proto- 
cols of the Columbia-Greystone patients (45). 
Utilizing a system of psychometric scales for 
scoring the responses (80), he found that of the 
35 scales which were applied, five showed a 
significant prognostic value in retrospect at the 
.05 level.⁴ Two scales in which high scores
might be regarded as reflecting perceptual 
clarity were positively correlated with out- 
come in a four-year follow-up (surface color 
and perception of plant life); three scales in 
which high scores might be regarded as re- 
flecting conceptual clarity were negatively re- 
lated to outcome (dehumanization tendency, 
popularity of percept, reaction time). On the 
basis of this retrospective study it may be 
tentatively concluded that the perceptually 
clear but conceptually confused patients had 
a better prognosis than the perceptually con- 
fused but conceptually clear patients. It should 
also be noted that not one of the orthodox 
scoring categories proved to be prognostic, 
though some derivatives of the orthodox sys- 
tem were found to have prognostic value when 
scaled psychometrically. 
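
The chance expectation for finding five significant scales among 35 can be checked with a short binomial calculation. This is a sketch under the same assumption stated in the footnote, namely that the scales are independent; the function name is hypothetical.

```python
from math import comb


def prob_at_least(k: int, n: int, alpha: float = 0.05) -> float:
    """P(at least k of n independent null tests are significant at level alpha)."""
    return sum(comb(n, j) * alpha ** j * (1.0 - alpha) ** (n - j)
               for j in range(k, n + 1))


if __name__ == "__main__":
    # Five or more significant scales out of 35, each tested at the .05 level.
    print(round(prob_at_least(5, 35), 3))
```

The result is close to .03, in line with the footnoted figure; if the scales are correlated, as Rorschach scales may well be, the true chance probability could differ.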


SUMMARY 


A review of the literature on the use of psy- 
chological tests for prognosis revealed an ex- 
treme amount of variation and contradiction 
in the indices considered prognostic. In an 


4 The obtaining of five significantly differential scales
out of 35 at the .05 level of confidence has a probability
of about .03 of occurring by chance if the scales are assumed
to be independent.

attempt to partially explain these contradictions,
the factor of chronicity was investigated
in psychotics. The prognostic index for chron- 
ically ill psychotics was found to be the op- 
posite of that for acutely ill psychotics on the 
Complex Reaction Time Test. Further analysis 
of the preoperative data of the chronically ill 
psychotics revealed a certain amount of gen- 
erality for the finding that those who func- 
tioned well on psychological tests had a poor 
outcome. This prognosis applied regardless of 
therapy, but was limited to the cooperative 
patients. The uncooperative or deteriorated 
patients were almost always diagnosed as 
hebephrenic schizophrenics and had a very 
poor expectancy of remission. There was, then, 
a curvilinear relationship between performance 
and outcome in the chronic patients, those 
performing best and those unable to cooperate 
having a less favorable outcome than patients 
of low-level efficiency. 

It is important to emphasize that these 
findings have been exploratory and require 
validation from confirmatory studies before 
confidence can be placed in the conclusions. 
The fact that we have found support in the 
literature for our findings is of relatively little 
importance since so many different claims have 
been made that almost any hypothesis can find 
considerable agreement. It is, then, essential 
to confirm or refute our hypotheses with addi- 
tional patients in whom more factors can be 
controlled than was the case in our studies. 
This need for confirmation applies to most of 
the studies of psychological prognosis. 

Although it may be premature to speculate 
on the implications of our hypotheses concern- 
ing different prognostic signs for patients differ- 
ing in chronicity, it is not too soon to point out 
some of the criteria for a good prognostic 
study. First, it is important to control and 
specify the conditions of the study. Among 
those which may be important are: (a) diag- 
nosis, (b) type of therapy, (c) duration of 
illness, (d) stage of disease, (e) time of follow- 
up, and (f) opportunities for, and criteria of, 
outcome. Second, what is most needed is con- 
firmation or refutation of existing claims of 
prognostic efficiency rather than new claims. 
Perhaps studies designed to test the range of 
conditions within which particular signs are 
prognostic would most satisfactorily establish 
the validity of previous claims. Last, the 
evaluation of the results should be statistically
acceptable, in contrast to the majority of
present-day reports of prognostic psychological 
tests. 

When investigators begin to apply some of 
these criteria, it may be possible to place some 
degree of confidence in the reports of prognostic
efficacy for psychological tests.


INDIVIDUAL CONFORMITY TO ATTITUDES OF
CLASSROOM GROUPS¹


WILBERT J. McKEACHIE 


University of Michigan 


AN INDIVIDUAL’S attitudes are influenced
to some extent by the groups of
which he is a member. Evidence for
this statement dates back to Moore’s experiments
in 1921, which demonstrated that
people will reverse their judgments if told 
that they differ from those of the majority 
of the group. In a similar experiment Marple 
(7) found, on a measure of attitude toward a 
number of controversial issues, that his 300 
high school seniors, college seniors, and adults 
made over half of the possible changes of 
attitude toward the majority attitude as 
compared with about 15 per cent made by a 
control group not told the majority attitude. 
Both of these experiments provide evidence to 
support the hypothesis that the individual 
tends to adopt attitudes corresponding to 
those held by the majority of the group. This
tendency has usually been termed “conformity.”

The perceived group norm. What is the in- 
dividual conforming to? Most investigations 
of the relationship of attitudes to group norms 
have dealt only with conformity to the group 
norm as perceived by the experimenter (E). 
For example, Marple (7) found that some 
members of his group changed their attitudes 
toward those designated as the attitudes of 
the majority of the group, while other group 
members did not change their attitudes in 
this direction or changed away from the atti- 
tudes of the majority. Why did some indi- 
viduals change in one direction, some in 
another, and some not at all? 

We might improve our ability to answer this 
question if we knew what group members 
perceived the group norm to be—both at the 
time of the pretest and at the time of the 
posttest. It seems probable that even before 
the opinion of the majority was announced, the 

1 This article is based upon research conducted for a 
Ph.D. dissertation completed in 1949 and carried out 
under the direction of Professor Donald G. Marquis. 
Drs. Harold Guetzkow, Everett Bovard, and Mr. Lee 
Danielson assisted in various aspects of the research. 

group members had some vague perception of 
the opinion of the majority. We can then ex- 
plain failure to conform thus: not all members 
of the group had the same perception of the 
group norm when the pretest was given. When 
confronted with what purported to be the 
group norm, some group members saw that it 
lay in one direction from their original 
perception² and changed their attitudes accordingly; 
others saw that it lay in another direction 
from their original perception and hence 
changed their attitudes in the opposite direc- 
tion from the first group. Those who did not 
change may have found that the announced 
norm was in accordance with their original 
guess, or else failed to believe E when he 
announced a different one. Hence, what Marple 
discovered may not have been the “prestige 
value” of majority opinion, but rather the 
effect of a change or lack of change in a per- 
ceived group norm. 

A study of the factors involved in conformity 
will, then, be more definite if one studies con- 
formity to the perceived group norm, rather 
than to a norm perceived by E. We shall call 
this relationship between the individual’s 
attitude and his perception of the group norm 
congruence, retaining “conformity” as the 
term referring to the relationship of attitude 
to the objective group norm (see Fig. 1). 
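
The distinction between the two terms can be made concrete with a small computational sketch. The scores below are invented for illustration and are not data from this study, and the function names are assumptions of the sketch. Conformity is taken here as the absolute distance of a member's attitude score from the group mean, and congruence as the Pearson correlation between members' own attitudes and their estimates of what most of the group will check.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def conformity_distances(attitudes):
    """Distance of each member's attitude from the objective norm (group mean)."""
    norm = sum(attitudes) / len(attitudes)
    return [abs(a - norm) for a in attitudes]


def congruence(attitudes, perceived_norms):
    """Correlation between own attitude and own estimate of the group norm."""
    return pearson_r(attitudes, perceived_norms)


# Hypothetical scores for six members of one class (illustration only).
attitudes = [40, 44, 52, 47, 55, 38]
perceived_norms = [45, 46, 50, 47, 51, 44]

print(conformity_distances(attitudes))
print(round(congruence(attitudes, perceived_norms), 3))
```

Smaller distances mean closer conformity to the objective norm, while a higher correlation means greater congruence. The two can move independently, since one is measured against the group mean and the other against each member's own estimate.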

The reference group—a matter of degree. 
Newcomb (9) has extended our question, 
“what is the individual conforming to?” by 
introducing the term “reference group.” Ac- 
cording to his hypothesis, the individual tends 
to conform to the norm, not necessarily of the 
groups of which he is a member at a given 


2 This theory assumes that when an individual takes 
an attitude test in a group, the group acts as a con- 
straint upon him even if the group norm exists for him 
in only a vague way. An experiment by F. H. Allport 
(1) demonstrated that an individual’s judgments in a 
group were less extreme than his judgments alone. This 
indicates that the group norm exists for individuals, at 
least in some vague form, even when it has not been 
made explicit. 
Fig. 1. VARIABLES INVOLVED IN CONFORMITY 
EXPERIMENTS 


moment, but to the norm of a group to which 
he refers his attitude. 

When is a group a reference group? The 
usual solution is to use an all-or-none classi- 
fication. Either the individual is a member of 
the group or he is not a member; either he 
uses a group as a referent for his attitude or 
he does not use that group as a referent. 

It seems probable, however, that a number 
of groups may be influencing an individual at 
any moment in time. Thus, two individuals 
who refer to the same group may be influenced 
to a different degree or in different directions 
by that group; therefore, quantitative meas- 
ures of the individual’s group membership are 
needed. Those who have discussed group iden- 
tification or similar concepts (e.g., Krech 
and Crutchfield [5], Festinger [4], and New- 
comb [10]) have indicated that group member- 
ship, group identification, or group belonging- 
ness are not simple all-or-none concepts. 
Festinger, for example, defines “cohesion” as 
“attraction of the group,” and has evidence 
that the more cohesive the groups the greater 
the conformity to the group norms. Does this 
hold true for norms not directly related to 
group goals? Do the individuals who are most 
attracted to the group show greater con- 
formity? 

Conformity and group process. In any dis- 
cussion of the influence of groups upon the 
individual’s behavior, Lewin’s classic group 
decision experiments (6) inevitably come 
to mind. Lewin and his associates found that 
women who had held a discussion and then 
raised their hands to indicate that they would 
serve certain desired foods did serve the foods 
to a much greater extent than women who had 
simply listened to a lecture inducing them to 
serve the foods. Other experiments showed that 
in bringing about changes in behavior, this 
procedure, which Lewin called group decision, 
was also superior to discussion without de- 
cision or to individual instruction. 

Group decision evidently is an effective 
method of influencing behavior. Conformity 
to the group norm is high. But even though no 
norm was announced, conformity to the group 
norm of not serving the desired foods was 
even higher in the lecture groups. Obviously 
more research is needed to isolate the effects 
of differing group procedures upon conformity. 
Does the discussion before the group decision 
promote greater tolerance for deviation from 
the norm, or does it help in mobilizing group 
pressures toward uniformity? 


HYPOTHESES 


In order to fill some of the gaps in our knowledge 
about conformity, an experiment was 
devised to test the following hypotheses: 

1. Attitude shifts of group members are 
positively correlated with changes in their 
perceptions of the group norms. (A group 
member’s perception of the group norm is 
defined as his estimate of the attitude of 
“most of the group.”) 

It should be pointed out that the confirma- 
tion of this hypothesis will not tell us the 
direction of causation. As we shall see in the 
discussion section, such a correlation may have 
two explanations. The purpose of testing this 
hypothesis is simply to provide evidence that 
anyone interested in conformity should pay 
attention to group members’ perceptions of 
group norms. 

2. There will be a higher positive correlation 
between attitudes of group members and their 
perceptions of the group norms in groups in 
which there is a greater liking for the group 
than in groups in which there is less liking 
for the group by group members. Corollary: 
The greater an individual’s liking for a group, 
the greater the congruence between his atti- 
tudes and perceived group norms. 

3. The correlation between group members’ 
attitudes and perceptions of the group norm 
will be lower after participating in a group 
decision preceded by a group discussion than 
after listening to a lecture and writing an 
essay about the problem. 

This hypothesis is based on the assumption 
that group discussion weakens forces toward 
congruence, which group decision can only 
partially restore. It does not contradict 
Lewin’s results. As we stated earlier, con- 
gruence was probably high in Lewin’s lecture 
groups. But it does contradict the assumption 
that discussion contributes to congruence. 


METHOD 


In order to test our hypotheses we need to create 
groups whose members differ between groups in the 
degree of liking for the group. While this might con- 
ceivably be done in groups meeting but once, it seems 
desirable to have groups meeting over a period of time, 
especially since we need to subject each group to differ- 
ing procedures for changing attitudes. For these reasons, 
and because of their availability, the experiment was 
carried out in elementary psychology classes at the 
University of Michigan. 

The measures of the attitudes used were Wang and 
Thurstone’s Attitude-toward-the-Treatment-of-Crimi- 
nals scale scored by Likert’s technique, Koch’s Attitude- 
toward-the-Freedom-of-Children scale scored by the 
Likert technique, and Likert’s Attitude-toward-the- 
Negro scale. 

Each of these tests was given as a pretest during the 
second and third weeks of the semester. Each student 
was asked to check his own attitude and to indicate by 
a zero the position on each item which “most of the 
class will check.” Approximately at the end of each 
month one of the three procedures for arriving at group 
norms described below was used in each section in ac- 
cordance with the experimental design (see Table 2). A 
week after the experimental treatment of each topic, 
the students took the attitude scale related to the topic, 
following the same procedure of indicating their own 
attitudes and the attitude of the class. Thus pre- and 
posttest scores for the attitudes of students and their 
perceptions of the attitude of “most of the class” were 
available. 

Students for the six sections involved in the experi- 
ment were not especially selected from the total en- 
rollment of the elementary psychology course at the 
University of Michigan. Each section consisted of 25 to 
35 undergraduate students with the largest number 
coming from the sophomore class of the literary college. 
Students enrolling for sections at these hours were 
assigned to these sections alternately, i.e., the first stu- 
dent registering was assigned to one section, the second 
student to a section taught by the alternative method. 
Students were not told that they were participating in 
an experiment. Each section met for one-hour periods 
three times weekly for a semester. 

Differences in cohesiveness. Three instructors each 
agreed to teach their two sections of the elementary 
psychology course in different ways. In order to build 
up differences in liking for the group and feeling of 
membership in the group, they agreed that their techniques 
should differ in (a) opportunity of class members 
to know other members of the class, (b) amount of 
direct interaction between class members, and (c) num- 
ber of decisions which the class would be allowed to 
make about its own goals and procedures. 


TABLE 1 

MEMBERS’ LIKING FOR THE GROUP IN CLASSES 
TAUGHT BY DIFFERENT METHODS 

Group                      Mean    SD 
Leader-centered classes    2.0     1.51 
Group-centered classes     3.2     [illegible] 

Specific procedures used were as follows: 

a. Members of the experimental or group-centered 
class introduced themselves at the first meeting and 
each member made a seating chart identifying other 
members of the class. In the control or leader-centered 
class only the instructor possessed a seating chart and 
only he introduced himself. 

b. In group-centered classes the instructor referred 
as much as possible of the discussion from student to 
student and refrained from interrupting student ex- 
changes. In control groups he commented upon or an- 
swered each student participation, so that interactions 
between students were mediated through him. 

c. In group-centered classes the instructor gave 
direction at the beginning of the semester, but as the 
semester progressed, he referred more decisions to the 
group. Thus students in group-centered sections made 
group decisions on their assignments, the number and 
dates of tests, and even on having class and breakfast 
together in an especially reserved lunchroom. The 
decisions made in the experimental groups about tests 
and assignments were also carried out in the control 
groups in which the instructor simply announced the 
assignments and tests. Thus assignments and tests were 
the same in both groups. 

These procedures were effective in producing the 
desired differences. On a scale on which students rated 
from —5 to 5 their dislike or liking for the group, 
students in the group-centered classes expressed 
significantly greater liking for their groups (p < .01). 
These results are presented in Table 1. 

Differences in group process of arriving at norm. The 
reason for using different techniques in presenting the 
attitude was to analyze more carefully some of the 
factors involved in group decision experiments. 

The first technique used was that of an open vote. I 
call this “group decision.”³ Students discussed the 
problem, preliminary votes were taken on the alterna- 
tive solutions to the problem presented by the instruc- 
tor, and compromises made until agreement on a solu- 
tion could be reached. Observers recorded the facts and 
arguments used in the discussion, and these formed the 
content of the lecture given in the groups which used 
the other two techniques. The order of procedures in 
each group is indicated in Table 2. 

The second procedure was the giving of informa- 
tion and arguments on both sides of a problem in a 
lecture by the instructor. Following the lecture each 
student spent 10 to 12 minutes writing an essay on one 
of the six suggested alternative statements of attitude 
toward the problem. 


3 It should be pointed out that my group decisions 
were in reference to attitudes. Usually this term refers 
to decisions about behavior rather than attitudes. 
TABLE 2 
DESIGN OF EXPERIMENT 

(First block, presumably the leader-centered sections: cell layout not recoverable from the source. Procedures: lecture and secret vote, lecture and result of vote announced, group decision. Topics: attitude toward the treatment of criminals, attitude toward the Negro, attitude toward the freedom of children.) 

GROUP-CENTERED SECTIONS 

Lecture and secret vote    Children     Negro        Criminals 
Vote announced             Criminals    Children     Negro 
Group decision             Negro        Criminals    Children 

TABLE 3 

CORRELATION OF SHIFT OF ATTITUDE WITH CHANGE 
IN PERCEPTION OF GROUP NORM 

Test                                           r       N 
Attitude toward the freedom of children        .350    121 
Attitude toward the treatment of criminals     .422    121 
Attitude toward Negroes                        .391    137 


On the class day following the lecture, students were 
told, “Most of the students in this class chose the 
following alternative ....” In all cases the alternative 
chosen by the group decision was also that chosen by 
the majority in this, which I shall refer to as the “vote 
announced,” group. This technique was designed to 
test the effect of making a group norm explicit. 

The third technique was identical with the “vote 
announced” technique except that the results of the 
vote were not announced. I shall call this the “lecture” 
group. 

Thus, information and arguments about the subject 
were equivalent for all three methods. The last two 
methods were equivalent in all respects except an- 
nouncement of the norm, and the norm announced was 
the same in the first two procedures. 

Design. Since it would be inadvisable to treat each 
attitude by each method in each section, i.e., treat each 
topic three times in each section, it was not possible to 
use an ordinary factorial design. In a situation such as 
this, the latin-square design permits the maximum of 
sources of variance to be isolated. However, this experi- 
ment as blocked out in Table 2 is not a simple latin- 
square design, but is actually a double latin square with 
replications. 
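
The defining property of a latin square can be checked mechanically, as in the sketch below. The arrangement used is the group-centered block of Table 2 as far as it can be read from the printed table; treating its three columns as the three group-centered sections is an assumption of this sketch, as are the function and variable names.

```python
def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok


# Rows are procedures, cells are attitude topics.
# The column meaning (one column per section) is assumed, not stated in the source.
procedures = ["lecture and secret vote", "vote announced", "group decision"]
group_centered = [
    ["children", "negro", "criminals"],
    ["criminals", "children", "negro"],
    ["negro", "criminals", "children"],
]

for procedure, topics in zip(procedures, group_centered):
    print(procedure, "->", topics)
print(is_latin_square(group_centered))
```

Because each topic meets each procedure once and appears once in each column, topic, procedure, and column effects can be separated without treating any topic more than once in a section.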

Hypotheses were tested by computing the significance 
of differences between the within-groups correlations 
derived by analysis of covariance.⁴ 


RESULTS AND DISCUSSION 


Probably the first question one asks, al- 
though this was not the primary focus of the 
experiment, is, “did the variables produce 
shifts of attitude?” Significant changes in atti- 
tude toward the treatment of criminals and in 
attitude toward Negroes had occurred—a 
change which may not be startling but is all 
too rare in the record of the effect of teaching 
upon attitudes.⁵ 

Relationship of shift of attitude to change of 
perceived group norm. We have seen that the 
students liked their groups. In fact, only one 
student expressed dislike for the group. 
Thus we have a situation in which we would 
expect the individual’s attitude to be in- 
fluenced positively by the group norm. In 
the introduction, I suggested that in such a 
situation the important variable is the sub- 
ject’s perception of the change in the group 
norm. If E considers only the objective group 
norm, he fails to account for many changes 
in scores while the norm has remained con- 
stant, or must invoke genii called “degree of 
suggestibility” or “contrasuggestibility.” 

As we predicted, correlations of the shifts 
in attitude of individuals with changes in 
perceived group norm were significantly dif- 
ferent from zero (p < .01). These correlations 
are given in Table 3. 
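
As a rough check on what "significantly different from zero" means at these sample sizes, the sketch below applies Fisher's z transformation to each correlation. This is a textbook approximation swapped in for illustration, not necessarily the test used in the study, and the r and N values are those of Table 3 as read from the printed table (decimal points assumed).

```python
from math import atanh, erfc, sqrt


def r_significance(r, n):
    """z statistic and two-tailed p for H0: rho = 0 (Fisher z approximation)."""
    z = atanh(r) * sqrt(n - 3)
    p_two_tailed = erfc(abs(z) / sqrt(2))
    return z, p_two_tailed


# r and N as read from Table 3; the decimal points are an assumption.
table_3 = {
    "freedom of children": (0.350, 121),
    "treatment of criminals": (0.422, 121),
    "Negroes": (0.391, 137),
}

for test, (r, n) in table_3.items():
    z, p = r_significance(r, n)
    print(f"{test}: z = {z:.2f}, below .01: {p < 0.01}")
```

With more than a hundred cases per test, correlations of this size fall well beyond the .01 level under this approximation, which agrees with the significance level reported in the text.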

This finding indicates that our ability to 
predict attitude shifts is improved by con- 
sidering group norm perceptions. However, 
this finding does not show how this relation- 
ship is determined. It seems that either or 


4 Formulae for treating this design were developed by 
Professor Paul S. Dwyer, Consultant in the Statistical 
Research Laboratory of the University of Michigan. 

5 Another interesting question is this: how is con- 
gruence affected by the attitude involved? Taking the 
correlations between attitude and perceived group norm 
separately for the three tests used, we find that the 
correlation is significantly less (p < .02) for attitude 
toward the Negro than for attitude toward the treatment 
of criminals. The correlation for attitude toward the 
freedom of children is not significantly different from 
either of the other two. This result, while peripheral to 
our main findings, is probably a good illustration of the 
fact that a given group does not have the same effect 
on all attitudes. In addition to the relevance of attitude 
to group functioning, some attitudes are undoubtedly 
more difficult to change because they were learned in 
the family or other important reference groups. 
both of the following processes are involved: 
(a) An individual who shifts his attitude 
projects his own attitude shift onto the 
group and tends to perceive the other group 
members as having changed similarly. (b) 
An individual whose perception of the group’s 
attitude changes tends to shift his own attitude 
to maintain a similar relationship to the group 
norm. 

Further experimentation to reveal the exact 
operation of these processes should give us a 
clearer understanding of the traditional prob- 
lem of the “prestige” influence of majority 
opinion. 

The generality of this research is limited by 
the fact that all but one of the students in 
this experiment were either neutral or posi- 
tively oriented toward their groups as indi- 
cated by their responses on the liking for the 
group scale. Nevertheless, in some cases even 
the individual who is negatively oriented to- 
ward the group may shift his attitude in the 
same direction as he perceives the group’s 
attitude to be changing. Ordinarily we think 
of the individual who is negatively oriented 
toward the group as shifting his attitude in 
the opposite direction from the group norm. 
Actually the direction of the shift may depend 
upon the situation. If the change in the per- 
ceived group norm increases the distance be- 
tween the individual and the disliked group, his 
attitude may not shift. But, if the perceived 
group norm shifts toward him, he may shift 
his attitude in the same direction to maintain 
the same degree of nonconformity. 

Relationship of attraction-to-group and con- 
gruence. Our second hypothesis predicted that 
in the classes in which there was more liking 
for the group and feeling of membership in 
the group there would be a higher correlation 
between attitudes and perceived group norms. 
This prediction about congruence on the post- 
test was not only not verified, but as the data 
in Table 4 indicate, if I had used a two-tailed 
test, congruence would have been signifi- 
cantly lower in the group-centered classes. 

That congruence is not a simple function of 
cohesiveness is also indicated by the finding 
that within the groups the degree of liking 
for the group was not significantly related to 
the degree of congruence. 

Despite the fact that our hypothesis about 
congruence was not confirmed, students in 
group-centered classes actually did conform 
to the group norm⁶ more closely than did 
students in leader-centered classes. (See Table 
5.)⁷ 

TABLE 4 

CONGRUENCE IN LEADER-CENTERED AND 
GROUP-CENTERED CLASSES 

POSTTEST 

Group                      N 
Leader-centered classes 
Group-centered classes 

TABLE 5 
MEAN DIFFERENCE OF ATTITUDE FROM GROUP NORM 

                     ATTITUDE TESTS 
Group              Children   Negro   Criminals 
Leader-centered    6.2 
Group-centered     7.3 

(Remaining cell values in the source, cell assignment unclear: 5.6, 7.9, 5, 3.6, 6.) 

F = 5.24 for 1 and 252 df. 
p < .05. 


How can these results be explained? 

One group of variables is composed of fac- 
tors affecting one’s perception of the group 
norm. Here we have such factors as objective 
clarity of the norm, misperceptions due to 
personality defenses, etc. On the surface it 
would appear that these variables cannot alone 
account for our result because our groups were 
not significantly different in the accuracy 
of their perception. 

Another group of variables is that having 
to do with the relationship between one’s 
own attitude and the perceived group norm, 
or congruence. Festinger and his colleagues 
have shown cohesiveness to be of importance 
in determining conformity. Because the mem- 
bers of a cohesive group are more strongly 
motivated to remain members, we would ex- 
pect greater fear of rejection and, conse- 
quently, greater conformity to group norms. 
Our group-centered classes were more cohesive 
than our leader-centered classes (using Festinger’s 
definition of cohesion as “attraction of 
group”). Yet, these group-centered classes 
showed less congruence than the leader-centered 
classes. How can we explain this? 

6 “Group norm” here refers to the mean score of the 
group on an attitude test. 

7 The significance tests for Table 5 and Table 7 
should be interpreted with caution. Since Bartlett’s test 
indicated that the assumption of homogeneity was not 
justified, a log transformation was applied. However, 
even after this transformation, Bartlett’s test was 
significant at the .05 level. 

TABLE 6 

CONGRUENCE AFTER DIFFERENT GROUP 
PROCESSES WERE USED FOR 
CONSIDERING A PROBLEM 

Procedure         r 
Lecture           .507 
Vote announced    .498 
Group decision    .320 

We have already suggested that need to be 
accepted by the group is one of the major 
motives for conformity. But groups differ 
in the degree to which nonconformity is 
punished. In a group such as ours, in which 
there has been a good deal of interaction be- 
tween members, the group member should 
be able to develop a fairly good idea of what 
behavior the group will reward, what it will 
ignore, and what it will punish. Perhaps a 
good deal of his feeling of security in a group 
depends upon his knowledge of these limits. 
It seems probable that in most democratic 
groups the pattern of rewards and punishments 
is such that the group member will learn to 
cooperate on issues where uniformity of be- 
havior is necessary to group progress. How- 
ever, such groups are likely to permit or even 
reward individual variation on problems 
which require individual rather than group 
action. This ability to differentiate between 
areas where conformity is necessary and where 
it is not necessary may not only be a measure 
of the security of the individual group member 
but also, when summed for the whole group, 
may be an important dimension related to 
the group’s effectiveness in problem solving. 

Relationship to congruence of procedures used 
in arriving at norm. Our third hypothesis stated 
that congruence will be less after a group de- 
cision than after a lecture. As indicated in Table 
6, our results showed that congruence was 
significantly lower (p = .03) following group 
decision than following a lecture. 
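
A comparison of two correlations of this kind can be sketched with Fisher's z transformation. This is a swapped-in textbook technique and not the within-groups analysis of covariance used in this study. It treats the two samples as independent, which the present design does not strictly satisfy because the same students met every procedure. The r values are those printed in Table 6; the sample size is hypothetical, since the number of cases per procedure is not reported.

```python
from math import atanh, erfc, sqrt


def compare_correlations(r1, n1, r2, n2):
    """z and one-tailed p for H1: rho1 > rho2, assuming independent samples."""
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (atanh(r1) - atanh(r2)) / se
    p_one_tailed = 0.5 * erfc(z / sqrt(2))
    return z, p_one_tailed


# Hypothetical number of cases per procedure (not given in the source).
HYPOTHETICAL_N = 100

# Lecture r = .507 and group decision r = .320, as printed in Table 6.
z, p = compare_correlations(0.507, HYPOTHETICAL_N, 0.320, HYPOTHETICAL_N)
print(f"z = {z:.2f}, one-tailed p = {p:.3f}")
```

The one-tailed form matches the directional hypothesis that congruence would be lower after group decision. The resulting p depends directly on the assumed sample size, so it should not be read as a reproduction of the reported value.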

Again this result may be interpreted in 
terms of rewards and punishments involved in 
the group process. If we assume that one of 
the primary motives for a group member’s 
conformity is his need to gain acceptance or 
avoid rejection by the group, the function of a 
discussion becomes more apparent. If the dis- 
cussion is one in which the group member 
hears many divergent attitudes expressed 
and if these deviations are tolerated by the 
group, the forces toward conformity will be 
weakened. On the other hand, if conforming 
statements are rewarded and deviation results 
in rejection, the forces toward conformity 
will be increased. Our group discussions pre- 
ceding group decisions were extremely per- 
missive, and it is not surprising that con- 
gruence was reduced. 

Let us turn now to the factors which result 
in a discrepancy between the “real” group 
norm and the perceived group norm. These, 
too, help us to understand the effectiveness of 
group decision. 

One of these factors is the ambiguity or 
“clarity” of the norm. We know from many 
studies that the individual’s needs and past 
experiences are involved in his perception of a 
social situation. The more ambiguous the 
situation the more these individual factors 
enter into perception. Thus, an ambiguous 
norm can easily be interpreted differently by 
each group member. The result is a low degree 
of conformity. 

Too often we have assumed that if E told a 
group, “the group norm here is so and so,” 
each member would perceive the norm in the 
same way. Unfortunately, not all subjects 
trust psychologists, and if their needs to dis- 
believe in a particular norm are strong, they 
are likely to dismiss the announcement. 
Consequently, they maintain conformity to a 
group norm which is more agreeable to them. 
This, I think, is one of the explanations for 
the high degree of congruence in our “lecture” 
procedure, where the norm was relatively 
vague and unknown. 

Nevertheless, the objective situation is an 
important factor in perception, and one of the 
features of group decision is that it makes the 
group norm clearly perceptible to members of 
the group. While congruence was low in our 
“group decision” procedure, students’ atti- 
tudes in this group were actually significantly 
closer to the “real” group norm than in the 
other techniques (see Table 7). These students 
were more accurate in their perception of the 
group norm as indicated in Table 8. While the 
discussion had weakened the pressure felt by 
the individual to align his attitude with that of 
the group, the vote had made clear the norm 
to which he was relating, and the resulting 
conformity was thus greater than in other 
groups. 

Thus our findings indicate that clarity of 
the norm is an important factor in conformity 
and that a permissive discussion weakens 
tendencies to conform. 

With these findings Lewin’s “group dis- 
cussion” experiments may be more clearly 
interpreted. Lewin describes three phases of 
group decision—unfreezing, change of level, 
and freezing at a new level. 

The phase of “unfreezing” is accomplished 
by lessening the forces toward conformity to 
the old norm. If one of the forces toward con- 
formity is the threat of nonacceptance by the 
group when one diverges from the norm, letting 
individuals present varying points of view 
and accepting these divergent opinions with- 
out punishment should remove some of the 
fear of diverging from the norm. 

The step, “change of level” of behavior, 
requires strengthening forces directed toward 
the new level or reduction of forces directed 
away from the new level. In the women’s 
groups in which Lewin was attempting to 
change food habits, the “unfreezing” discussion 
was accompanied by dietitians’ recipes and 
information aimed at weakening or removing 
some of the forces which had been preventing 
serving of the experimental foods. Thus, 
forces toward serving the experimental foods 
became proportionately stronger during the 
discussion. 

“Freezing behavior at the new level” in- 
volves reinvoking forces to conformity to the 
new norm. This is accomplished by a hand 
vote and in at least one of Lewin’s experiments 
by the knowledge that a check on behavior 
would be made by E. In terms of my empha- 
sis upon the perceived norm, I interpret this 
as a method of making the new norm clearly 
perceived and unambiguous. Unanimity in 
the decision is important, therefore, not only 
because it makes the norm clearer, but because 
it re-emphasizes the necessity for conformity 
if group acceptance is to be obtained. 

In the light of this theory, Lewin’s lecture 
groups were unsuccessful in changing behavior 
because change of level was attempted 
without unfreezing. As a result, conformity to 
the old standard of behavior was very high. 

TABLE 7 
MEAN DIFFERENCE OF ATTITUDE FROM GROUP NORM 

                   ATTITUDE TESTS 
Procedure         Children   Negro   Criminals 
Lecture           7.7 
Vote announced    7.0 
Group decision    5.6 

(Only one value per row survives in the source.) 

F = 2.66 with 2 and 252 df. 
p < .10. 

TABLE 8 
MEAN DIFFERENCE OF PERCEIVED GROUP NORM FROM 
GROUP NORM 

                   ATTITUDE TESTS 
Procedure         Children   Negro   Criminals 
Lecture           6.7        7.6     7.9 
Vote announced    5.9        4.4     5.1 
Group decision    5.8        4.4     4.4 

F = 5.91 with 2 and 252 df. 
p < .01. 

Lewin’s discussion groups were less suc- 
cessful than group decision because although 
unfreezing and changes of level were carried 
out, the new group norm was not made clear 
and perceptions of the norm probably varied. 
In addition, in both groups the forces for con- 
formity were probably weakened if varying 
viewpoints were accepted during the discus- 
sion. 

The group-decision technique was successful 
because all three phases had been carried out. 
In addition, it is possible that the discussion 
before the group decision may actually have 
increased the attractiveness of the group, 
which may have helped counteract the effect 
of lack of punishment of divergent viewpoints. 


SUMMARY 


1. To study the relationship of the indi- 
vidual’s attitudes to group norms, experimen- 
tal classroom situations were set up involving 
three sets of variables: (a) The relationship 
between attitude change and changes in perception 
of the group norm. (b) The relationship 
between attraction to the group and 
“congruence” of attitudes and perceived 
group norms. (c) The effect of different group 
processes used in considering a problem upon 
congruence. 

2. Attitude changes were found to be significantly 
correlated with changes in perceived group 
norms. 

3. Classes taught by a group-centered tech- 
nique created greater member-liking for the 
group than leader-centered classes, but con- 
gruence was less in group-centered classes. 

4. A group-decision technique resulted in 
less congruence but greater conformity than a 
lecture. 

5. The findings are interpreted in terms of a 
theory emphasizing the importance of the 
distribution of rewards and punishments ad- 
ministered by the group for conformity and 
the discrepancy between the objective group 
norm and the perceived group norm. 
PROJECTIVE METHODS AND VERBAL LEARNING


BARBARA NORFLEET COHN 
Harvard University 


The projective techniques are commonly
believed to reveal dominant motivational
forces. They are believed to
perform this function because the subject (S) 
is given “several degrees of freedom to organize
a plastic medium in his own way, and since 
little external aid is provided from conven- 
tional patterns, he is all but obliged to give 
expression to the most readily available forces 
within himself” (14, p. 215). Two dimensions 
seem to be involved: (a) freedom, or a large
number of response alternatives, and (b)
conventionality, or degree of “familiarity.”
Actually, when viewed from these two dimensions,
the projective techniques are not uniformly
“free” nor “unfamiliar.” The Rorschach
seems to be at the free, unfamiliar poles; the 
TAT is located more at the free, quite familiar 
poles. Word-association tests are like the TAT 
in this respect. Tachistoscopic tests of per- 
ceptual sensitivity run the gamut of all 
possible combinations of the two dimensions. 

Of course it is realized that many of the 
verbal responses to projective tests represent 
mere familiarity or recency. It would be sur- 
prising to encounter a medical student who 
did not use some anatomical vocabulary in 
his imaginative productions. Thus, in tachisto- 
scopic studies, a very powerful variable, 
word frequency, demonstrates that much so- 
called “projective material” is merely a 
reflection of vocabulary habits. How, then, can 
we say that motivational forces are revealed 
by such projective techniques? The answer 
seems to lie partly in the specification of both 
the freedom-for-responding granted to S, 
and in the degree of familiarity of the stimulus 
material presented to S. Probably the rest of 
the answer lies in the hedonic or motivational 
associations related to the stimulus material 
as well as to the responses required of S. 
These problems of specification can be readily 


1 This research was supported by the Laboratory of 
Social Relations, Harvard University. The author is 
particularly indebted to Professor Richard L. Solomon 
for many helpful suggestions in the planning and 
carrying out of this study. Drs. Leo Postman and D. 
O. Hebb suggested many ideas and improvements. 


restated in terms of an S-R analysis of verbal 
learning, and they suggest a defining experi- 
ment. 

For this task two assumptions are necessary:
(a) familiarity of stimulus materials
and frequency of prior exposure are almost
synonymous, and (b) the aftereffects of
reward and punishment give stimulus materials
a “motivational character” that mere frequency
of repetition or exposure cannot give.

The number of repeated presentations of 
verbal materials, i.e., frequency (7, 11, 13), 
and the rewarding, punishing, or neutral 
aftereffects of each presentation, i.e., conse- 
quence (2, 8, 9, 10), are known to affect the 
amount of verbal learning. Further, it is 
known that the various methods of measuring 
verbal learning and retention, such as recog- 
nition and free recall following a constant 
learning procedure, give differing quantitative 
values (1, 4, 5, 6). However, it is not known 
how the different methods of measuring verbal 
learning interact with the systematic varia- 
tion of antecedent conditions such as fre- 
quency and consequence. This paper presents 
data on the following questions: 

1. Are some methods of measuring verbal
learning more sensitive than others in differentially
revealing the effects of the antecedent
conditions of frequency and consequence?

2. Are there common dimensions among 
methods of measuring verbal retention which 
might account for differing sensitivities of 
this kind? 

It is clear that these two questions are 
referable to the original problem concerning 
the mode of operation of projective procedures. 
If we vary degree of familiarity of verbal 
stimuli jointly with consequence (reward or 
punishment or neutrality), we are controlling 
S’s relationship to certain stimulus materials. 
If we vary the methods of measuring verbal 
learning, we, in fact, vary S’s freedom of
response; e.g., the recall test offers fewer 
restraints than does the recognition test. 
In seeking to discover how the antecedent 
conditions (frequency and consequence) interact
with measuring procedures, we elicit a
further question: what are the conditions most 
conducive to the revelation of the effects of 
consequence and the motivational substrate 
of verbal responding? It is not clear to us 
whether we are asking about the character- 
istics of a “good” projective test; but that 
question is interesting to ask in the light of 
this study, and one answer is, most certainly, 
that the “goodness” of such a test is con- 
ditional upon the frequency and consequence 
of antecedents. 


METHOD 
The Learning Situation 


In order to demonstrate simultaneously the effects 
of frequency and consequence on verbal learning, as 
well as the role of the various measurement procedures 
in coloring the apparent effects of these two inde- 
pendent variables, it was necessary to subject all Ss 
to the same initial learning situation. The Ss were first 
exposed to the materials which they would later be re- 
quired to use in one of the measurement situations 
(test situations). However, E’s actual purpose in carrying
out the experiment was kept from the Ss. Each S
was told that the experiment was on language and pro- 
nunciation; that in the course of language-training re- 
search it had been discovered that students had great 
difficulty in pronouncing foreign words; and that E 
was studying these findings. The S was told that the 
words he would be shown were from an Arabic dialect, 
and that his job was to pronounce each word as well 
as he could. He was also told that some words were 
made up by E merely to see how S pronounced them 
and that there was no correct or incorrect pronunciation
of these words.

Each S was shown seven-letter nonsense words, one at
a time. Each word was typed on its own file card, and 
was held up in front of S until S pronounced it. The S 
was immediately rewarded or punished for his pro- 
nunciation; or, if the word was one of those of the class 
stated to have been made by E, neither reward nor 
punishment was administered. The time interval be- 
tween pronunciations varied with the ease of pronun- 
ciation of a given nonsense word. The Ss’ responses 
to neutral and punished words were somewhat more 
delayed than were responses to rewarded words. The 
words were made up in such a way that repeating letter 
patterns and the associative strength with English 
words were kept at a minimum. The word cards were 
shuffled before testing each S. 

Each nonsense word had assigned to it a fixed frequency
of occurrence in the pack of word cards. There
were a total of 246 word cards: Six words appeared
but once, six words appeared five times, six words appeared
ten times, and six words appeared 25 times. The
S’s pronunciation of each nonsense word was followed
immediately by one of these consequences: He was
told that his pronunciation was correct, and he was
rewarded with a poker chip (exchangeable for money
at the end of the experiment); he was told that his
pronunciation was incorrect, and he was punished by
forfeiture of a poker chip to E; he was neither rewarded
nor punished, and no poker chips changed hands. The
effects, or consequences of pronunciation, were actually
applied to a specific word according to a prepared
design, and the administration of reward and punishment
had no relation to the pronunciation behavior of
the Ss. Permutated across these three conditions were
the four frequency conditions.

TABLE 1
EXPERIMENTAL DESIGN AND STIMULUS WORDS

CONSEQUENCE   STIMULUS WORDS (EIGHT PER CONSEQUENCE, TWO PER FREQUENCY)
Rewarded      jasonup, gimadaz, tecugaw, dofukal, puhasej, rimakeb, mehbiroj, fonaped
Punished      kuvalog, wudizom, labeduf, silucer, hebosam, nevucog, pikoven, vozimet
Neutral       zupajic, noguvis, zakebif, gewirac, sujefiv, howanil, rujowet, batiwos

FREQUENCY     1            5            10            25
Words         6            6            6             6
Exposures     6 × 1 = 6    6 × 5 = 30   6 × 10 = 60   6 × 25 = 150

The assignment of particular nonsense words to 
frequency-consequence categories was varied from S 
to S. The general design is shown in Table 1. Eight 
words were followed by reward, eight were followed by 
punishment, and eight were followed by neutral con- 
sequences. Actually, in order to control for possible 
inherent differences in difficulty of pronunciation and 
retention, the words were randomly assigned to each 
cell of the experimental design for each S. Order of 
presentation was randomized by shuffling the pack of 
word cards between Ss. If a word was found to follow 
itself, this word was randomly placed into another 
position in the pack. This randomization procedure, 
while not as acceptable as a systematic factorial de- 
sign, was more practical because it imposed no mini- 
mum number of Ss to complete the design. The ran- 
domization procedure prevented study of order effects 
and effects due to inequivalence of the nonsense words. 
The design allowed study of only the effects of fre- 
quency and consequence. 
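The card-pack procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's actual procedure: the function names `assign_cells` and `build_deck`, the seeded random generator, and the `max_moves` guard are hypothetical, and the word list simply copies the words as printed in Table 1. The three "dummy" cards mentioned later in the text are not included.

```python
import random
from collections import Counter

# The 24 nonsense words as they are printed in Table 1.
WORDS = [
    "jasonup", "gimadaz", "tecugaw", "dofukal", "puhasej", "rimakeb",
    "mehbiroj", "fonaped", "kuvalog", "wudizom", "labeduf", "silucer",
    "hebosam", "nevucog", "pikoven", "vozimet", "zupajic", "noguvis",
    "zakebif", "gewirac", "sujefiv", "howanil", "rujowet", "batiwos",
]
CONSEQUENCES = ("rewarded", "punished", "neutral")
FREQUENCIES = (1, 5, 10, 25)


def assign_cells(rng):
    """Randomly place two words in each consequence-by-frequency cell."""
    words = WORDS[:]
    rng.shuffle(words)
    cells = {}
    for i, consequence in enumerate(CONSEQUENCES):
        for j, frequency in enumerate(FREQUENCIES):
            start = 2 * (i * len(FREQUENCIES) + j)
            cells[(consequence, frequency)] = words[start:start + 2]
    return cells


def build_deck(cells, rng, max_moves=10_000):
    """Shuffle the cards, then re-place any card that follows its own copy."""
    deck = [word
            for (consequence, frequency), pair in cells.items()
            for word in pair
            for _copy in range(frequency)]
    rng.shuffle(deck)
    for _move in range(max_moves):  # hypothetical guard against endless looping
        repeats = [i for i in range(1, len(deck)) if deck[i] == deck[i - 1]]
        if not repeats:
            return deck
        card = deck.pop(repeats[0])
        deck.insert(rng.randrange(len(deck) + 1), card)
    raise RuntimeError("could not remove all immediate repetitions")


rng = random.Random(0)          # one hypothetical S
cells = assign_cells(rng)
deck = build_deck(cells, rng)
print(len(deck))                # 6*1 + 6*5 + 6*10 + 6*25 cards
```

Drawing a fresh assignment and a fresh shuffle for each S mirrors the randomization described in the text; as the author notes, this choice gives up the study of order effects and of word inequivalence.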


The Test Situation 


Immediately after the 246 word presentations and 
pronunciations, each S, without prior warning, was
exposed to one test situation. Learning appears to take 
place when an S seems to be strongly motivated to 
accomplish some task other than the memorizing of 
the material in question; thus, these test situations may 
be considered to measure incidental learning or its 
analogue in animal experimentation, “latent learning.”
(In this experiment the existence of the incidental or
latent-learning phenomenon is not in question.)

The Ss were divided into four groups for testing 
purposes. (a) Ten Ss were asked to recall all the words
they could remember from the exposure situation. This
is the free recall test, and the score was the number of
the 24 possible words correctly recalled.

(b) Ten Ss were asked to recognize words from the 
exposure situation contained in, or buried in, a list 
containing 24 unfamiliar, but similarly constructed 
nonsense words. This is the recognition test, and the 
score was the number of words correctly identified, 
corrected for guessing. 

(c) Ten Ss were asked to recognize words exposed 
tachistoscopically. This is the tachistoscopic recognition 
test. Among these words were the 24 from the exposure 
situation; 24 new, unfamiliar, but similarly con- 
structed nonsense words; and 12 seven-letter English 
words. They were presented for threshold determina- 
tion in random order. The score for each word was the 
visual duration threshold (in .01 sec.) for correct recog- 
nition, using the ascending method of limits, a recog- 
nition criterion of two successive correct responses, 
and exposure-increase steps of .01 sec. 

(d) Ten Ss were given anagrams of the 24 words from 
the exposure situation. The anagrams contained the 
seven letters needed to compose a word from the ex- 
posure situation, but letter order was scrambled. This 
was the anagram-reconstruction test, and the score was 
the number of anagrams correctly solved, with a 3-min. 
time limit for each anagram. Each S was tested indi- 
vidually in both the learning situation and test situa- 
tion. 
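The threshold rule used in the tachistoscopic test (ascending method of limits, steps of .01 sec., criterion of two successive correct responses) can be sketched as follows. Several details are assumptions, since the text does not state them: the exposure is lengthened after every trial, the recorded threshold is the first of the two successive correct exposures, and the starting duration and `ceiling` are hypothetical. Durations are kept as whole hundredths of a second to avoid rounding problems.

```python
def ascending_threshold(is_correct, start=1, step=1, criterion=2, ceiling=500):
    """Ascending method of limits.

    `is_correct(duration)` reports whether S named the word at that exposure
    duration (in hundredths of a second). Returns the duration at which the
    run of `criterion` successive correct responses began, or None if the
    ceiling is passed first.
    """
    streak = 0
    duration = start
    while duration <= ceiling:
        if is_correct(duration):
            streak += 1
            if streak == criterion:
                # Assumption: score the first exposure of the successful run.
                return duration - step * (criterion - 1)
        else:
            streak = 0  # a miss breaks the run of successive correct responses
        duration += step
    return None


# Hypothetical observer who reads the word at .05 sec. and above.
print(ascending_threshold(lambda d: d >= 5))
```

An isolated lucky response does not end the series under this rule, because the streak is reset by the next miss.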

The following limitations of the procedure should be 
kept in mind: The anagram test usually took a little 
over an hour to complete, while the recall test took 
approximately 10 min.; the tachistoscopic recognition 
test took over 2 hrs. in some cases. Thus, there could 
well be differential forgetting confounded with the dif- 
ferent testing conditions. Such differences are not, 
however, confounded with the independent variables. 
Secondly, recency effects were left partly uncontrolled 
since the pack of word cards was shuffled between Ss. 
Other conditions being equal, the high frequency words 
should be more often among the most recent words. 
This will be true of any random series of stimuli where 
frequency of stimulus classes is varied. A prohibitive 
number of Ss would be required in order to control 
for the confounding of frequency and recency in this 
experiment. However, three “dummy” words were 
placed at the end of the pack of cards to help minimize 
recency effects. 


RESULTS 


The role of frequency. The mean learning 
scores as a function of frequency are given in 
Table 2 and are plotted in Figs. 1 and 2. 
The tachistoscopic data were transformed 
into logarithms to reduce the distortion in- 
duced by deviant high thresholds. Clearly, 
duration thresholds vary inversely with fre- 
quency of prior exposure, while recall, ana- 
gram, and recognition scores vary positively 
with frequency. Of the 48 cells, the score in 
only one fails to go in an orderly direction
(recall, frequency of 10, consequence, neutral). 

The role of consequence. The mean scores 
TABLE 2
MEAN SCORES IN THE FOUR TEST SITUATIONS


RECALL 








Consr- FREQUENCY 


QUENCE 





10 25 





Rewarded . . 70 
Punished .§0°? 
Neutral 10°° 











Mean .43** 





ANAGRAMS 





pera | 0 .30 
Punished o* .50 
Neutral | 0 .30* 

| 





Mean o* —_—" 





RECOGNITION (IN HUNDREDTHS 


oF SECONDs) 


TACHISTOSCOPIC 





ae 20 12 19 
—” -.a2 .22 
.18* ll .23 


Rewarded .27 
Punished ae ae 
Neutral .38 .23* 





7e* 12 | 21 


Mean . we 4 





RECOGNITION (CORRECTED FOR GUESSING) 





1.88 | 
1.88 
1.88 


1.44 
1.51 
1.38 


1.58 
1.78 
1.68 


| .35** 
—” 
.35** 


Rewarded 
Punished 
Neutral 





Mean a” tie” te ie 








Note.—The percentage of correct words to total possible cor- 
rect words may be easily computed for the recall, anagram, and 
recognition tests: Dividing any cell by 2 will give the mean per- 
centage of correct words for that cell. 

* t tests on individual scores indicate that this frequency
score is significantly (.05 level) lower than the score obtained
by the adjacent ascending frequency.

** t tests on individual scores indicate that this frequency
score is significantly (.01 level) lower than the score obtained by
the adjacent ascending frequency.

† t tests on individual scores indicate that this consequence
(punished or rewarded) is significantly (.05 level) higher than the
score obtained by neutral.


as a function of consequence are also given in
Table 2. Means for consequence for all frequency
conditions are plotted in Fig. 3.
The results for consequence are not as clear
as were the results for frequency. The results
of t tests indicate that the effects of consequence
are significant only for the following:
the recall test; the difference between rewarded
and neutral words at a frequency of
10; between punished and neutral words for
all frequencies. There were too many empty
cells to do an analysis of variance in the recall
and anagram tests, but an analysis of variance
on individual scores in the recognition and
tachistoscopic tests demonstrated that frequency
was highly significant and that consequence
was not significant.

FIG. 1. MEAN SCORES ON THE TESTS AS A FUNCTION
OF FREQUENCY OF PRIOR EXPOSURE
TO THE VERBAL MATERIALS
(Abscissa: frequency. Separate curves for recognition, anagrams, and recall.)
The smooth curves are theoretical.

FIG. 2. MEAN LOGARITHM OF VISUAL DURATION
THRESHOLDS FOR VERBAL MATERIALS AS A FUNCTION
OF FREQUENCY OF PRIOR EXPOSURE TO THE
MATERIALS
(Abscissa: frequency.)
The smooth curve is theoretical.
FIG. 3. MEAN SCORES ON ALL TESTS AS A
FUNCTION OF THE THREE CONSEQUENCE
CONDITIONS
(Ordinate: mean score. Abscissa: consequence. Separate curves for recall,
recognition, anagrams, and the tachistoscopic test at short and long flash
durations.)

The tachistoscopic test scores have been divided
into scores for long-flash durations and short-flash
durations. The function for short-flash durations falls
directly on that for free recall, and so two separate
curves could not be plotted.





Since significant differences for consequence 
are found only in the recall test, a more 
sensitive test is probably needed to show the 
relative effects of consequence for the four 
measuring situations, however slight the 
effects may be. The total of rewarded- and 
punished-consequence mean scores as a 
percentage of total mean score is such an 
index.2 If consequence has no effect, one
would expect the percentage to be 66.7%,
since two-thirds of all words presented are 
either rewarded or punished. The obtained 
percentages were 75.9% for the recall test, 
68.9% for anagrams, 67.7% for the tachisto- 
scopic test, and 66.7% for the recognition 
test. 
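The index just described is simple arithmetic, and a small sketch may make it concrete. The function name `consequence_index` and the example scores are hypothetical and are not taken from Table 2; the only thing borrowed from the text is the definition (rewarded plus punished mean scores as a percentage of the total mean score) and the two-thirds baseline.

```python
def consequence_index(rewarded, punished, neutral):
    """Rewarded plus punished mean scores as a percentage of the total."""
    total = rewarded + punished + neutral
    if total == 0:
        raise ValueError("index is undefined when all mean scores are zero")
    return 100.0 * (rewarded + punished) / total


# With no effect of consequence the three means are equal, and the index
# sits at the two-thirds baseline.
print(round(consequence_index(1.0, 1.0, 1.0), 1))

# Hypothetical means in which rewarded and punished words are favored.
print(round(consequence_index(0.6, 0.7, 0.4), 1))
```

Values above the baseline indicate that rewarded and punished words contribute more than their share of the total score, which is how the 75.9% for recall is to be read against the 66.7% for recognition.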


DISCUSSION 


Replies to the two questions posed earlier 
in this article may now be attempted: 

1. Are some methods of measuring verbal 
learning more sensitive than others in re- 
vealing the effects of antecedent conditions? 
Figure 1 demonstrated that, regardless of the 
test used to measure memory, frequency of 
prior exposure has a homologous effect on 
what is manifestly remembered. The shapes
of the curves relating the frequency variable
to test score are surprisingly similar, although
their intercepts and asymptotes depend on
the specific characteristics of each test. All
these curves are fitted reasonably well by the
function

$$H = M - M e^{-n \log \frac{1}{1-K}}$$

This represents a growth curve: H is the mean score for
any given frequency; M is the physiological
maximum (in all the tests but the tachistoscopic
test this equals 2 and each S could
achieve a perfect score of 2 at any given
frequency); e is 10; n is the frequency; and K
is the measuring constant for each test.3 K is
a constant fraction of the difference between
M and the score of the preceding frequency.
Thus, the function rises to the physiological
M by increments $H_n - H_{n-1}$ which are a
constant fraction K of the remaining growth
potential $M - H_{n-1}$. K may be thought of as
an index of associative strength needed for
performance.

TABLE 3
K CONSTANTS FOR THE FOUR TEST SITUATIONS

                          FREQUENCY
TEST              1       5       10      25      MEAN
Recall                    .039    .024    .033    .032
Anagrams                  .039    .056    .038    .044
Tachistoscopic    .084    .089    .098    .085    .089
Recognition       .147    .223    .169    .107    .162
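The growth function described in this paragraph can be written out as a short sketch. It assumes the reading given in the text: each added exposure closes a constant fraction K of the remaining distance to the maximum M, which with e taken as 10 is the same as H = M − M(1 − K)^n. The function names `growth_score` and `k_from_score` are hypothetical, and the inversion for K is simply algebra on that formula, not a procedure stated by the author.

```python
import math


def growth_score(n, K, M=2.0):
    """Mean score H after n exposures: H = M - M * 10**(-n * log10(1/(1-K)))."""
    return M - M * 10 ** (-n * math.log10(1.0 / (1.0 - K)))


def k_from_score(H, n, M=2.0):
    """Invert the growth function: the K that yields score H at frequency n."""
    return 1.0 - (1.0 - H / M) ** (1.0 / n)


# Each step adds the fraction K of the remaining growth potential M - H(n-1).
K = 0.107
for n in (1, 5, 10, 25):
    step = growth_score(n, K) - growth_score(n - 1, K)
    remaining = 2.0 - growth_score(n - 1, K)
    print(n, round(growth_score(n, K), 2), round(step / remaining, 3))

# A recognition score of 1.88 out of 2 at a frequency of 25 implies a K
# near the recognition constants reported for that frequency.
print(round(k_from_score(1.88, 25), 3))
```

Writing the exponent with base 10 keeps the sketch close to the printed formula; the equivalent form M(1 − (1 − K)^n) is easier to reason about and gives the same numbers.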

Table 3 gives K constants for the four test 
situations. Although the tachistoscopic score, 
recorded in terms of time, requires some 
modification of the equation,4 the equation


2 Since a high score in the tachistoscopic test indi- 
cates an absence of learning, it was necessary to trans- 
form these scores. The method used was complex, and 
cannot be described here. 

3 See Hull, C. (3, ch. VIII), for a detailed explanation
of this equation. The present analysis of frequency
is similar to his analysis of the relationship between 
habit strength and number of reinforcements. 

4 The equation for the tachistoscopic situation is

$$Y = (Y_0 - M') e^{-n \log \frac{1}{1-K}} + M'$$

Y = score in terms
of mean logarithms of time; $Y_0$ is the intercept of the
function on the Y axis; M′ = an approximation of the
point at which the function becomes asymptotic to
some value of Y greater than zero (the value of 1.33
was found to be a good approximation here). The
remaining symbols have identical meanings to the
equation used for the other test situations. This equation
is identical to the one used above. The range of
does allow a comparison of this situation with
the other three test situations. Thus, we find 
that the different test situations do not appear 
to distort the effects of frequency of prior 
exposure. All functions of the frequency 
variable are approximately fitted by the same 
equation with different constants for the 
various tests. 

Since no continuum exists for consequence 
in the present design, no curves can be com- 
pared. But consequence, unlike frequency, 
does indicate that some methods of measuring 
verbal learning are more sensitive than other 
methods in revealing its effects. Recall is 
most sensitive and recognition least sensitive. 

2. Are there common dimensions among 
methods of measuring verbal retention which 
might account for differing sensitivities as to 
the effects of consequence? One relationship 
between the different methods of measuring 
verbal retention and the effects of consequence 
is clear: As the mean score for the measurements
increases,5 the effects of consequence
become harder to demonstrate. Recall, with 
the lowest mean score, varies significantly 
with the consequence variable; while recogni- 
tion, with the highest mean score, does not 
do so. This indicates that the “easier” the 
measuring test is for S, the less will the effects 
of the consequence variable have an oppor- 
tunity to become manifest. Other conditions 
being equal, two dimensions appear to deter- 
mine what makes a test simpler rather than 
more difficult: (a) The amount of stimulus
pattern common to the learning and test 





H alone is altered. The author will be glad to send to 
anyone who is interested a detailed explanation of how 
it was worked out. 

5 Bearing in mind likelihood of error in points found 
by extrapolation, we can use the equation for the growth 
curve and the measuring constant for each test (K) to 
make some estimate of the frequencies at which the 
curves for the other conditions would reach 1.88/2.00 
of the maximum (Ss in the recognition test, the easiest 
of the tests used, made this score at a frequency of 25). 
Solving this equation for m, and substituting the proper 
K constants, gives the following estimates of the fre- 
quencies theoretically needed in the other test situa- 
tions to reach the score attained in the recognition 
test. Recall would need a frequency of 87, anagrams 
63, and the tachistoscopic test 31. These predictions 
are not very solid. Massing, intraserial effects, degree 
of similarity between words, etc. would certainly cause 
the data to deviate from the theoretical curve. But 
these computations do suggest how much more diffi- 
cult are some test situations than others. 
situations, and (b) the sample size of possible
responses available to S in the test situation
(in information terminology, the “ensemble”).
The measured effect of consequence varies
inversely with the former, and directly with
the latter.6
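The extrapolation in footnote 5 (the frequency each test would need to reach the recognition score of 1.88 out of 2.00) follows from solving the growth function for n. The sketch below assumes the form H = M − M(1 − K)^n, uses the mean K constants for recall, anagrams, and the tachistoscopic test, and rounds up to the next whole exposure; the rounding rule and the name `frequency_needed` are assumptions. Applying the unmodified formula to the tachistoscopic constant follows the footnote, although that test strictly uses a modified equation.

```python
import math


def frequency_needed(target, K, M=2.0):
    """Number of exposures n at which M - M*(1-K)**n reaches `target`."""
    return math.log(1.0 - target / M) / math.log(1.0 - K)


# Mean K constants for the three harder tests.
K_MEANS = {"recall": 0.032, "anagrams": 0.044, "tachistoscopic": 0.089}

for test, K in K_MEANS.items():
    n = frequency_needed(1.88, K)
    print(test, round(n, 1), math.ceil(n))
```

Because the estimate is the ratio of two logarithms, small changes in K near zero move the required frequency a great deal, which is one reason the footnote calls these predictions "not very solid."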

The implications of these findings for 
projective techniques are interesting. The 
projective test can be seen as an extreme 
example of a lack of specific stimulus pattern 
common to the test situation and any previous 
learning situation, and the sample size of 
possible responses available to S is almost 
completely uncontrolled. Thus, one would 
expect material of this type to elicit verbal 
responses which are closely related to the 
motivations of S. Frequency or familiarity, 
on the other hand, would not have much 
influence in revealing effects of S’s motiva- 
tions. Words of low frequency were not re- 
called any more easily when they had been 
followed by reward and punishment than 
were words of high frequency. As found experi- 
mentally, the important factors are (a) the 
lack of external aid, and (6) the freedom to 
give any response in the test situation, rather 
than the familiarity or unfamiliarity of the 
test materials themselves. This finding helps 
us to understand why the TAT and word- 
association tests may be successful projective 
techniques in spite of the high familiarity of 
the material with which they deal. 

In addition, there is a good check on this 
interpretation to be found in the tachisto- 
scopic test. This situation covers the entire 
range of conditions from recall to recognition 
in the sense that very short exposure durations 
are similar to the recall test (very little of the 
stimulus pattern from the exposure situation 


6 It should be noted that these dimensions might be
“overcome” in any particular test situation by stacking
the cards deliberately against them. For instance, in
the recognition test one could ask S to choose the
“correct” words from a group of words so similar to
the original that the amount of stimulus pattern in 
common between the exposure and test situations 
would become relatively unimportant in determining 
the score. In the present experiment the predictions 
based on these two dimensions are confounded, but an 
experiment could be designed to isolate them from one 
another; for example, one could enlarge the sample 
size of possible responses by placing the word to be 
recognized in a list of 1,000 different words. Such pos- 
sibilities do not, however, erase the relevance of these 
two dimensions in interpreting the results of our ex- 
periment. 
is present in the test situation), and very
long exposure durations are similar to the 
recognition situation (most of the stimulus 
pattern from the exposure situation is present 
in the test situation). But how should we 
decide what is a low threshold and what is a 
high threshold in the tachistoscopic situation? 
This decision can be made in a somewhat 
arbitrary, but not unreasonable, way. The 
mean score for recall was .48, which represents 
correct recall of 24% of the words; therefore, 
a visual duration threshold to represent recall 
was chosen to include the lowest 24% of the 
individual tachistoscopic scores. This thresh- 
old was .05 sec. The threshold for recog- 
nition was chosen in the same way. The mean 
score for recognition was 1.32, or correct 
recognition of 66% of the words. The cor- 
responding visual duration threshold was .10 
sec., a value which included 66% of 
threshold scores. These scores were then 
studied for the effects of consequence. The 
equivalent tachistoscopic recognition scores 
showed no effects of consequence, and the 
percentage of rewarded and punished mean 
scores to total mean score was 67.9, as com- 
pared with 66.7 for pure recognition. But, the 
equivalent recall scores showed a significant 
difference between punished and neutral 
words (t = 2.6; df = 9; p = .05), and the
percentage of rewarded and punished mean 
scores to total mean score was 75.9% which 
is identical to the percentage found for pure 
recall. Therefore, the tachistoscopic pro- 
cedure produced data which behaved as 
though it were proceeding from recall to 
recognition memory. It is interesting that 
some writers have labeled the procedure a 
perceptual test, whereas here it appears 
entirely reasonable to talk about it as a 
learning measure. 
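The way the "equivalent" recall and recognition cutoffs were chosen (a duration threshold that takes in the lowest 24%, or the lowest 66%, of the individual tachistoscopic scores) can be illustrated with a small sketch. The threshold list is invented for the example, the function name `cutoff_for_fraction` is hypothetical, and the handling of ties is an assumption, since the text does not say how they were treated.

```python
import math


def cutoff_for_fraction(thresholds, fraction):
    """Smallest observed threshold that includes at least `fraction` of scores."""
    ordered = sorted(thresholds)
    k = max(1, math.ceil(fraction * len(ordered)))
    return ordered[k - 1]


# Hypothetical duration thresholds in hundredths of a second.
scores = [3, 4, 5, 5, 6, 7, 8, 9, 10, 12, 15, 20]

recall_cutoff = cutoff_for_fraction(scores, 0.24)       # lowest 24% of scores
recognition_cutoff = cutoff_for_fraction(scores, 0.66)  # lowest 66% of scores
print(recall_cutoff, recognition_cutoff)
```

Words whose thresholds fall at or below the first cutoff would then be scored like recalled words, and those at or below the second like recognized words, before the consequence index is computed for each set.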

These data indicate that there may be a 
middle ground of agreement in the controversy 
over the relative contributions of word fre- 
quency and motivational factors in deter- 
mining ease of perception of stimuli presented 
under difficult viewing conditions. As the 
tachistoscopic situation approaches a pro- 
jective test, in terms of allowing a wide range 
of possible responses in the presence of little 
external aid from the stimulus material, 
verbal responses tend to reflect motivational 
factors; but as it becomes more restricting, 
frequency tends to be the overruling influence. 
The consistency of these findings would
indicate that the different methods used 
here to measure verbal learning do differ in 
their sensitivity in revealing the effects of the 
antecedent condition of consequence. The 
tests can be ordered according to two dimen- 
sions which help to explain these differences. 

One may wonder why consequence does 
not “show up” more in this experiment. It is 
possible that the test situation employed did
not enable S to use the words in such a way 
as to make the consequences from the learning 
situation “meaningful.” A final test was run 
which required that ten Ss recall the words, 
but recall them in such a way that the conse- 
quences of a word would be important. In 
place of free recall, S was asked to recall 
those words which he would “feel most confident
in using if he visited Arabia”; those he
would “feel least confident in using”; and 
those he would not “know about.” The S 
was told to use all words he had formerly 
pronounced, and the three classifications were 
requested in different orders from S to S. 
As a precautionary measure, S was told the
three classes before he was allowed to give
any response. As expected, consequence had
fairly large effects in this case: between
punished and neutral words, t = 3.9, which
at 9 df has a p value between .02 and .01;
between rewarded and neutral words, t = 2.1,
which at 9 df is just short of the .05 level.
The percentage of rewarded and punished 
mean score to total mean score was 77.6 in 
comparison to 75.9 for the recall test. Clearly, 
this test was more difficult for S, as indicated 
by the lowest mean score yet attained, .41 
out of a possible 2.0. 

The methods used here to measure verbal 
learning indicate that, as far as the effects of 
consequence are concerned, tests differentially 
measure the learning (or memory) process. 
It may be that Thorndike (9, 10) found con- 
sequence to be of such great importance 
because he used the free recall test almost 
exclusively to measure verbal learning. 
However, Tolman (12) found that consequence 
may not be very important in determining 
what is learned, but his rats were placed in a 
“recognition” situation. It may be that the 
kind of measurement used is even more fundamental:
When performance is measured (S’s
particular pronunciation of a word), conse- 
quence may be a strong determinant; but 
when learning, or memory, is measured 
(scores in later test situations) consequences— 
or the law of effect—may be a fairly weak 
determinant. Such would be the case in this 
experiment; Ss did repeat correct pronuncia- 
tions and change incorrect pronunciations 
during the learning situation, but they learned, 
and remembered, a large number of both 
punished and neutral words. Thorndike fre- 
quently measured “performance” rather than 
learning or memory. This interpretation
might be accepted by proponents of latent 
learning, but it should be remembered that, 
as the test situations became more difficult, 
consequence did tend to have manifest effects 
on what was tested. 


SUMMARY 


This experiment investigated the question 
of whether some methods of measuring verbal 
learning were more sensitive than others in 
revealing the effects of frequency and con- 
sequence. It was found that the effects of 
frequency were homologous, regardless of the 
test used to measure verbal learning. The 
methods were not equivalent in revealing the 
effects of consequence (reward and punish- 
ment). The more difficult the particular test 
used to measure verbal learning, the lower 
was S’s absolute score and the more likely 
to appear were the effects of reward and 
punishment. The implications of these findings 
for students of projective techniques and 
investigations of social perception were indicated.


IT HAS been said that men will believe the
impossible but not the improbable. The
former demand evokes the sense of fancy,
of wonder, and of faith; the latter, a set to
judge on the basis of everyday, ponderable
experience. With this thought in mind we read
the actuarial summary of Eysenck’s (2, p. 322)
recent evaluation of the effects of psychotherapy:
“Patients treated by means of psychoanalysis
improve to the extent of 44 per
cent; patients treated eclectically improve
to the extent of 64 per cent; patients treated
only custodially or by general practitioners
improve to the extent of 72 per cent. There
thus appears to be an inverse correlation between
recovery and psychotherapy; the more
psychotherapy, the smaller the recovery rate.”
We are therefore to infer that psychotherapy
is less effective than no psychotherapy. Are
we expected to believe the improbable, and
on the basis of statistics?

One might be content to rest the matter
there but for the insightful though militant
statement of Isaac Ray (5, p. 67), pioneer of
American psychiatry, who seventy-five years
ago on a similar issue (the “cult of curability”)
declared: “Statistics which are not really
statistics are worse than useless; and the reason
is that they beguile the student with a show
of knowledge and thus take away the main
inducement to further inquiry. Why should
he look further for truth when it already lies
before him? Some of the prevalent errors respecting
insanity and the insane are fairly attributable
to these vicious statistics, for figures
make a deeper impression on the mind than
the most cogent arguments.”

Eysenck’s survey could, moreover, be abused
by the biased and the uninformed with important
ill effects socially: if psychotherapy
does more harm than good, why should it be
supported financially or by public confidence?

1 This reply to Eysenck does not appear in the same
journal in which his paper was published because, according
to the editor, a policy decision not to print rejoinders
was arrived at in accepting Eysenck’s controversial
piece.

It is therefore the intent of this communica- 
tion to re-examine critically the data and ar- 
guments set forth by Eysenck in his attempt 
to prove that “‘the figures fail to support the 
hypothesis that psychotherapy facilitates re- 
covery from neurotic disorder” (2, p. 323), 
and briefly to suggest grounds upon which it 
is possible to undertake a veridical evaluation 
of the effects of psychotherapy. In view of the 
strongly supportive role played in Eysenck’s 
analysis by the comparable surveys of Landis 
(4) and Denker (1), these earlier appraisals 
will also be incidentally examined. 

To be noted at the outset is the difference 
between Landis and Eysenck in appreciating 
the difficulties involved in making any ap- 
praisal of the effects of psychotherapy. Landis 
(4, p. 156) summarizes the sources of these 
difficulties—ignorance of the nature or cause 
of mental disease, disagreement among experts 
concerning even such broad differentiations 
as somatogenic or psychogenic, lack of uniformity
with respect to categories of improvement—and
concludes: “Because of these difficulties,
it is apparent that statistical figures,
rates of recovery, etc., have to be evaluated 
cautiously, precisely, and with a minimum of 
generalization.” The caution thus advised is 
in contrast with Eysenck’s approach. Not only 
does Eysenck generalize freely, but in his paper 
one repeatedly encounters a polite bow of 
recognition to the sources of difficulty out- 
lined by Landis which are then lightly dis- 
missed. Confronted with the problems in- 
volved in assembling data from 24 separate 
studies, each of which employed its own 
methods and criteria, Eysenck indicates several 
reconciliatory devices he adopted, then adds 
(p. 322): “The total number of cases involved
in all these adjustments is quite small. Another
investigator making all decisions in exactly
the opposite direction to the present writer’s 
would hardly alter the final percentage figures 
by more than 1 or 2 per cent.” Again, considering
the definition of neurosis and offering
his redefinition, he recognizes (p. 321) that a 
“degree of subjectivity is probably implied 
in the writer’s judgment as to which disorders 
and diagnoses should be considered to fall 
under the heading ‘neurosis,’ ” but, as before,
he concludes with facility: “The number of 
cases where there was genuine doubt is prob- 
ably too small to make much change in the 
final figures, regardless of how they are allo- 
cated.” When he presents his procedure for 
overcoming the diversity in gradations of im- 
provement as reported in the various studies, 
he remarks (pp. 320-321): “A slight degree of 
subjectivity inevitably enters into this pro- 
cedure, but it is doubtful if it has caused much 
distortion.” In taking up the serious questions 
as to whether the patients in his control (no 
treatment) groups were as seriously ill as those 
in the experimental groups and whether 
standards of recovery possibly differed in the 
two divisions, he offers some brief speculations 
and then concludes (p. 322): “In the absence 
of agreed standards of severity of illness, or of 
extent of recovery, it is not possible to go 
further.” The meaning here would seem to be 
that it is not possible to go beyond speculation 
but this admission in no wise deters the author 
from the sweeping conclusions he then pro- 
ceeds to draw. 

The foregoing resumé, largely in the author’s 
own words, will be appraised in the ensuing 
re-evaluation. The argument will take the 
following course: 

A. What is psychoneurosis? (It should be 
noted that Eysenck selects the psychoneuroses 
in making his evaluation.) Is Eysenck’s re- 
definition of neurosis consistent with a true 
evaluation of the effects of psychotherapy in 
this type of disorder? Is the severity of illness 
comparable in his contrasted groups? 

B. What is psychotherapy? Did the two 
control subgroups instanced by the author 
actually receive no psychotherapy? In other 
words, does the control group control, and is 
the so-called base line basic? 

C. What is improvement or recovery? Were 
the criteria for successful outcome as applied 
in the control and experimental groups identi- 
cal or even comparable? 


A. WHAT IS PSYCHONEUROSIS?


As has already been noted, Eysenck acknowledges
a degree of subjectivity in his
judgment as to what disorders should fall
under this diagnostic category—the critical 
one for his study. His decision (and redefini- 
tion) is as follows (p. 321): “Schizophrenic, 
manic-depressive, and paranoid states have 
been excluded; organ neuroses, psychopathic 
states, and character disturbances have been 
included.” On this basis, which many would 
question (particularly because of the inclusion 
of psychopathic states), he then proceeds to 
redistribute the data in the various reports 
included in his survey. It is his opinion, as 
quoted above, that these reallocations are too 
unimportant “to make much change in the 
final figures,” but it can be shown by a single 
example that this sanguinity on his part is 
not justified. The illustration is afforded by 
the data of Fenichel’s (3) report from the 
Berlin Psychoanalytic Institute, a report
utilized by both Eysenck and Landis. If one 
compares the number of psychoneurotics 
treated at that Institute as re-reported by 
Landis and by Eysenck, one finds that this 
figure is given by the former author as 312 
and by the latter as 484. Since the total number 
of cases accepted for treatment at the Berlin 
Institute in the ten years covered by the re- 
port was 604, it can be readily appreciated 
that Eysenck’s redefinition has caused a shift 
in diagnosis of 172 cases or 28 per cent. The 
effect on the figures for improvement and re- 
covery is even more germane and the difference 
between the estimates given by Landis and 
by Eysenck for this same Berlin group is there- 
fore noteworthy: Landis reports 58 per cent 
improved or recovered, and Eysenck, under 
this same heading (his Table 1), lists 39 per 
cent—a difference of 19 per cent. These dif- 
ferences of 28 per cent and 19 per cent are 
scarcely the negligible quantities that Eysenck 
has disarmingly led us to expect would be the 
result of his modified definition. 
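
The two discrepancies just cited can be checked directly from the counts quoted above. The following is only an illustrative sketch: the variable names are hypothetical, and the numbers are those given in the text for the Berlin Institute report as re-reported by Landis and by Eysenck.

```python
# Counts and rates exactly as quoted in the text for the Berlin Institute report.
total_cases = 604          # all cases accepted for treatment in the ten years
neurotics_landis = 312     # psychoneurotics as re-reported by Landis
neurotics_eysenck = 484    # psychoneurotics as re-reported by Eysenck

# Cases whose diagnosis differs between the two re-reports.
shifted_cases = neurotics_eysenck - neurotics_landis
shift_pct = 100 * shifted_cases / total_cases

improved_landis = 58       # per cent improved or recovered (Landis)
improved_eysenck = 39      # per cent improved or recovered (Eysenck, Table 1)
improvement_gap = improved_landis - improved_eysenck  # percentage points

print(shifted_cases)       # 172
print(round(shift_pct))    # 28
print(improvement_gap)     # 19
```

The shift of 172 cases is expressed here as a share of all 604 accepted cases, which is the base the text itself uses for the 28 per cent figure.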

Of equal significance is the question as to 
whether the severity of the illness in Eysenck’s 
control and experimental groups can be con- 
sidered to be comparable. To establish his 
base line for successful outcome without psy- 
chotherapy, Eysenck employed the figure of 
72 per cent given by Landis as the consolidated 
amelioration rate of New York state hospitals 
(1917-1934), and the identical figure of 72 
per cent (for successful treatment within two 
years) obtained from the report of Denker 
(1) for a group of 500 psychoneurotic disability
claims treated by general practitioners
throughout the United States. These control 
subgroups are compared with 24 experimental 
groups subdivided by Eysenck, according to 
the method of psychotherapy, as either psycho- 
analytic or eclectic. 

If one now returns to the question raised 
as to the severity of illness in the several 
groups, the following interpretations seem 
warranted. The insurance disability cases 
were, as a whole, in all likelihood less severely 
ill than any of the others. Denker points out 
(1, p. 2165) that in these cases where dis- 
ability income was a factor the illness may 
have been prolonged by this tangible secondary 
gain. By the same token the illness may very 
well have been initiated, or at least partially 
instigated, by conscious or unconscious pros- 
pects of such gains. To compare psycho- 
neuroses of long standing, dating in many 
instances from early childhood (the typical 
case treated by psychoanalysis), with such 
disability neuroses is highly dubious, and the 
fact that the latter would have cleared up 
quickly after brief treatment by a general 
practitioner is thus not surprising. At the other 
extreme from these disability patients are 
those cases, cited after Landis, which were 
institutionalized in the various New York 
state hospitals. Here one could reasonably ex- 
pect that the neuroses must have been extra- 
ordinarily severe in order for these patients 
to have become eligible for admission to these 
crowded institutions. In these instances the 
outcome of treatment would be expected to 
be far less favorable than for either the Denker 
control group or the experimental groups. 
But at this point a question arises as to the 
standards of recovery which would apply for 
discharge from a state hospital as compared 
with the criteria of recovery utilized by a 
psychoanalyst or psychiatrist in private prac- 
tice. To this problem we shall return later. 
For the present it may be concluded that, in 
general, the Denker base-line group was prob- 
ably less seriously ill, the Landis control group 
more seriously ill than the various experimental 
groups instanced by Eysenck. To the degree 
that this conclusion is sound it may be further 
inferred that the control and experimental 
groups fail to meet an essential criterion of 
comparability: illness severity. When this consideration
is added to the previous one, concerning
Eysenck’s redefinition of neurosis as
applied to the various studies, the basis for his
generalizations is seriously called into question. 


B. WHAT IS PSYCHOTHERAPY?


A second point of difference between the 
control and experimental groups in Eysenck’s 
survey concerns treatment: the control groups 
of Landis and Denker are presented as not 
having received psychotherapy in contrast to 
the experimental groups which did. The treat- 
ment given the latter groups is merely desig- 
nated as psychoanalytic or eclectic. Since 
Eysenck at no point defines what he means by 
psychotherapy, it becomes necessary to ex- 
amine the several specific mentions of psycho- 
logical treatment or lack of it in his survey, 
the main problem here being whether his as- 
sertion is supported that the control groups 
did not receive psychotherapy. 

If one turns first to the Denker group, it is 
discovered that in Eysenck’s words (p. 320) 
these patients were “regularly seen and treated 
by their own physicians with sedatives, tonics, 
suggestion, and reassurance, but in no case
was any attempt made at anything but this
most superficial type of ‘psychotherapy’ which
has always been the stock-in-trade of the
general practitioner.” On the basis of this 
description it is hardly possible to agree that 
these patients were not psychotherapeutically 
treated. The various presumably nonpsychotherapeutic
techniques mentioned include suggestion
and reassurance, well-known methods
of psychotherapy; and psychiatrists regularly
use sedatives and tonics as adjuncts to their
practice. That one is actually dealing here 
with psychotherapy—if not, to be sure, with 
psychoanalysis—becomes eminently clear when 
it is noted that among the experimental groups 
cited by Eysenck 80 per cent were treated by 
eclectic methods (see Table 1, B). What would 
these eclectic methods be if they did not in- 
clude the very techniques attributed to the 
general practitioner? The only difference be- 
tween the work of the general practitioner and 
of the eclectic psychiatrist that could be as- 
sumed, in the absence of detailed and specific 
knowledge, would be a difference in thorough- 
ness or expertness, not a difference in kind. 
And being aware that some general practi- 
tioners working with patients well known to 
them are excellent eclectic psychotherapists, 
one would have to make even this qualification 
very guardedly. A reading of Denker’s paper
makes it evident that he himself does not regard
his 500 disability cases as untreated by
psychotherapy; he assumes only that the 
general practitioner is less expert than the 
psychotherapeutic specialist. 

A similar inference is warranted with respect 
to the Landis control group. To maintain that 
neurotic patients admitted to state hospitals 
receive no psychotherapy is seriously open to 
doubt. These institutions, despite their no- 
torious shortage of staff, usually make a special 
effort to treat their neurotic admissions, be- 
cause these cases have a better prognosis, and 
because they are far more accessible to treat- 
ment. The methods employed would pre- 
sumably be eclectic, like those of the Denker 
study and of the various eclectic experimental 
groups. 

It must then be concluded that the control 
subgroups cited by Eysenck do not sharply 
differ from the experimental groups in respect 
to the important variable of having received 
psychotherapy. As before with regard to ill- 
ness severity, the necessary contrast between 
the base line and the experimental groups be- 
comes markedly attenuated. 


C. WHAT IS RECOVERY?


The crucial question in the present re- 
evaluation is, finally, whether the degree of 
improvement or recovery in the control and 
experimental groups can be regarded as equal. 
In other words, the control base line for per- 
centage of cases cured can attain true sig- 
nificance only if the degree of improvement 
for these cases is identical with, or, at least, 
closely similar to that which the experimental 
cases may be estimated to have achieved after 
intensive psychotherapy. To determine, on 
the available evidence, the answer to the ques- 
tion thus posed is the last step in this re- 
evaluation. 

Needless to say, degree of improvement is 
extremely difficult to assess and the difficulty 
is increased when one is dealing at second hand 
with cases treated by diverse methods and by 
various therapists. The most significant ob- 
stacle to the evaluation of degree of recovery 
lies, however, in the differences in improve- 
ment standards. In the present instance it is 
this particular difficulty which looms large. 
In view of the fact that the Denker group has 
already been shown to represent, in all probability,
a less severe degree of illness, this
group will not here be further discussed. For
the control group, attention will be focused 
on the far more representative and more gener- 
ally important base-line figure from Landis, 
derived from the consolidated amelioration 
rate of New York state hospitals for 1917-34; 
for the experimental groups both the eclectic 
and the psychoanalytic will be considered. 

As has been already stated, Eysenck reports 
that patients treated by psychoanalysis im- 
prove to the extent of 44 per cent, those treated 
eclectically, 64 per cent, and those treated 
custodially, 72 per cent. It was these figures 
which at the outset challenged us to believe 
the improbable. It will be the burden of the 
present section to show that the improbability 
resolves itself largely into differences in the 
presumed standards of improvement invoked 
in these three treatment pools. 

We may begin with two brief statements 
found in Landis (4). In discussing his Table 
II, which deals with the percentage of patients 
discharged from mental hospitals as recovered 
or improved, he characterizes (p. 159) the 
presented figures as indicating to what extent 
“hospitalization of one year or less yields 
sufficient improvement for favorable discharge” 
(my italics). By contrast, in describing the 
criterion employed at the Berlin Psycho- 
analytic Institute for recovery Landis (p. 162) 
paraphrases Fenichel (3) thus: “Only those 
cases were classified as recovered in which 
the success consisted in the disappearance of 
symptoms, and which also underwent an es- 
sential change which was completely explicable 
from the rational, analytical viewpoint.” (A 
less rigorous standard was, of course, employed 
by Fenichel for improvement, etc.) In other 
words, while patients residentially treated are 
generally considered in terms of hospital dis- 
charge and return to the community, the cri- 
terion of social recovery being highly relevant, 
patients nonresidentially treated, as by psy- 
choanalysis, live continuously in the com- 
munity and are worked with in terms of radical 
therapy which, if successful, permits them to 
live not only with others but with themselves. 
This difference in therapeutic goal is so great 
that percentage figures for residential and non- 
resident treatment are dubiously commensur- 
able. 

The fact that improvement or recovery as 
defined in relation to hospital discharge may 
reflect an extremely low standard becomes
patent if one notes, for example, that in
Table I of Landis, which presents the number 
of patients discharged annually as recovered 
or improved per one hundred admitted to state 
mental hospitals, the figure given for psycho- 
pathic personality is 75. Anyone having the 
slightest familiarity with this type of case is 
aware that such patients are in only the 
vaguest sense cured or improved at discharge. 
If 75 per cent of them are returned to the com- 
munity, this figure can only mean that they 
have temporarily made a social recovery. The 
majority of them will doubtless be back again
if they do not in the meantime gain admission 
to some other correctional institution. One is 
dealing here with a striking illustration of the 
comparatively low recovery standards inherent 
in state-hospital discharges.

It would, however, be incorrect to lump to- 
gether state-hospital discharges and discharges 
from intensive-treatment institutions like the 
New York Psychiatric Institute or Maudsley 
Hospital as if the same standard of recovery 
prevailed in both. We are here presumably 
dealing with a gradation not only as between 
nonresidential and residential improvement 
standards but as between standards in inten- 
sive-treatment hospitals and state hospitals. 
It must therefore be recognized that resi- 
dentiality of treatment is only an approximate, 
and by no means a perfect, criterion of the 
rigorousness of improvement standards.

The figures for residential and nonresidential 
treatment in the control and experimental 
groups cited by Eysenck were accordingly de- 
termined and will now be presented as an ap- 
proximate index of recovery standard. As indi- 
cated already, only the Landis control group 
was considered; it may be stated forthwith 
that the state-hospital patients included in it 
were residentially treated. For the experi- 
mental groups surveyed by Eysenck, it was 
found by a study of the available reports in- 
cluded in his survey that approximately 4,040 
of 5,262 eclectic cases, or 77 per cent, were 
residentially treated; of the 760 psychoanalytic 
cases, none were so treated.2
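
The residential-treatment percentages can be reproduced from the group totals reported in the accompanying footnote. This is a minimal sketch with hypothetical variable names; it assumes, as the footnote states, that the 1,222 nonresidential and 4,040 residential eclectic cases together make up the 5,262 eclectic cases considered.

```python
# Totals as reported in the writer's footnote on residential treatment.
eclectic_nonresidential = 1222
eclectic_residential = 4040
psychoanalytic_total = 760
psychoanalytic_residential = 0   # none of the psychoanalytic cases

# The eclectic total considered is the sum of the two eclectic subtotals.
eclectic_total = eclectic_nonresidential + eclectic_residential

eclectic_pct = 100 * eclectic_residential / eclectic_total
psychoanalytic_pct = 100 * psychoanalytic_residential / psychoanalytic_total

print(eclectic_total)              # 5262
print(round(eclectic_pct))         # 77
print(round(psychoanalytic_pct))   # 0
```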


2 The figures here presented were derived from the
reports cited by Eysenck in Table 1. (See his list of
References.) The following proved to have been treated
nonresidentially: Psychoanalytic: all five groups (total
N = 760); Eclectic: Huddleson, Neustatter, Luff and
Garrod (two groups), Yaskin, Carmichael and Masserman,
Schilder, Wilder (total N = 1,222). The following
were treated residentially: Psychoanalytic: none;
Eclectic: Matz, Maudsley (1931), Ross, Curran,
Hamilton & Wall, Hamilton, et al., Landis, Miles, et al.
(total N = 4,040). Several qualifications should be
noted. The Maudsley reports (Nos. 3 and 4 under B
in Eysenck’s Table 1) were not available to the writer.
However, the survey by Wilder includes the 1927-1931
Maudsley report, and from this source it is inferred that
these patients were treated residentially. The later
Maudsley group had to be omitted. The necessary
figures for the Institute of Medical Psychology group
(Eysenck’s No. B, 17) could not be found in the source
cited by Eysenck. The 50 cases reported by Masserman
and Carmichael are ambiguous as regards residentiality
and have therefore not been included. (Excluded
total N = 2,031.)

We arrive thus at a resolution of Eysenck’s
paradox. If 44 per cent of neurotics treated by
psychoanalysis, 64 per cent of those treated
eclectically, and 72 per cent of those treated
custodially improved or recovered within two
years, this sequence of figures does not prove
the improbability that the more intensive the
psychotherapy, the less benefit to the patient;
rather it reflects the probability that the more
intensive the therapy, the higher the standard
of recovery. This interpretation is borne out
by the fact that the three patient pools in the
order listed above vary in this same order with
respect to the frequency with which treatment
was given nonresidentially rather than residentially.
The figures 44, 64, and 72 correlate
perfectly with the above-given figures for
frequency of residential treatment in the three
patient pools: Psychoanalytic, 0 per cent;
Eclectic, 77 per cent; Custodial, 100 per cent.
The implication is: the higher the standard of
recovery, the smaller the degree of reported
success.3

3 Eysenck has made a slip on p. 320 where he indicates
that his survey of treated groups includes “the
results of nineteen studies reported in the literature... .”
Inspection of Table 1 on p. 321 makes it
evident that his survey actually included 24 studies.
The total series in the table is divided in two: subseries
(A) including five groups who received psychoanalytic
therapy, and subseries (B) consisting of nineteen groups
who received eclectic psychotherapy. Has the author in
the error on page 320 tacitly acknowledged the nineteen
eclectic groups and “repressed” the five psychoanalytic
ones?

If this conclusion is thought to depend too
much on inference, it is sufficient for the present
purpose to rest this part of the discussion with
the more modest statement that the standards
of improvement and recovery in Eysenck’s
various patient groups, control and experimental,
bear so little resemblance to each
other that, once again, the basis of his comparisons
has little demonstrable validity.
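
The statement that the improvement figures and the residential-treatment figures correlate perfectly can be illustrated as a rank-order comparison. The sketch below is an assumption about how that comparison might be computed (a hand-written Spearman coefficient with hypothetical function names), not a method given in the text; with only three patient pools it shows agreement in ordering and is not offered as a test of significance.

```python
def ranks(values):
    """Return 1-based ranks for a list of distinct values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman rank correlation for untied data: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Pools in the order psychoanalytic, eclectic, custodial (figures from the text).
improved_pct = [44, 64, 72]
residential_pct = [0, 77, 100]

rho = spearman_rho(improved_pct, residential_pct)
print(rho)  # 1.0
```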

The foregoing re-evaluation from the stand- 
points of the definition and severity of neurosis, 
amount of psychotherapy accorded, and 
standards of recovery in the several patient 
groups thus leads to the general conclusion 
that Eysenck’s data and arguments fail to 
support his thesis that psychotherapy cannot 
be shown to facilitate recovery. The need for 
a more circumspect use of statistics in this 
highly complex area of evaluation is under- 
scored. It is not, however, maintained that a 
conclusion in the opposite direction is war- 
ranted. The only safe deduction on the basis 
of currently available data is that, in view of 
the diversity of methods and standards in the 
field of psychotherapy, broad generalizations 
as to the effectiveness of treatment are to be 
avoided. 


One may seek to evaluate the effects of 
psychotherapy either by counting reported 
outcomes or by considering dynamic change 
in the process of treatment, or both. In the 
former instance each patient serves as a poll, 
for or against, as indicated by his subjective 
report or the therapist’s clinical judgment; in 
the latter, one examines each treated person- 
ality as a system of structures and forces 
which, in the course of therapy, is altered in 
a definable direction. To undertake an evalu- 
ation of the effects of psychotherapy by tally- 
ing outcomes at second hand, without even 
introducing the problem of dynamic change 
in various forms of illness and in differing 
therapeutic procedures, and, in default of such 
considerations, to reassign diagnoses and prog- 
noses is to invite the inconsistencies and non 
sequiturs that have been demonstrated in the 
foregoing reanalysis. But it is not to be con- 
cluded that statistical evaluation is totally to 
be eschewed because such pitfalls exist; the 
implication is, rather, that the use of statistics 
in the evaluation of psychotherapy demands 
special precautions and, in addition, the 
guarantee of a concurrent or prior dynamic 
evaluation of each case—an evaluation in 
which the complexity of the individual patient 
and of the therapeutic situation has been fully 
considered. Such a dynamic appraisal must 
take into account the organization of the 
patient’s personality and life space before,
during, and after treatment. It must, for example,
even allow for such statistically equivocal
outcomes as those in which therapy
produces no computable decrease in former 
symptoms but an increased capacity to ac- 
cept them and to accept oneself. It must, in 
other words, reflect an understanding by the 
therapist and, at least in part, by the patient,
of the reason or reasons for the alleged out- 
come of treatment. 

In internal medicine and surgery the neces- 
sity for empirical adjuncts to any clinical eval- 
uation of therapeutic change has long been 
recognized—as, for example, when X rays, 
blood tests, biological assays, etc. are invoked 
to determine, quite apart from the patient’s 
subjective report or the physician’s clinical 
judgment, what progress the patient is making
in healing a fracture or in overcoming an infec- 
tion. Psychology and psychiatry may not yet 
be ready for great precision in the making of 
such independent evaluations, but the last 
decade has seen rapid advances toward this 
goal. One reason for opposing such naive evalu- 
ations of the effects of psychotherapy as the 
one here re-examined is to keep open the 
intrinsically difficult road which such in- 
vestigation is destined to follow. 

What good is psychotherapy? As good as 
man’s faith in his humanity. Men have always 
believed in their ability to change for the better 
and to help each other so to change—through 
mutual assistance, love, religion, and art. Con- 
ceived in the broadest terms psychotherapy 
derives from the same faith and, employing of 
necessity some of the same means, attempts to 
formulate these more precisely. The question 
is not, then, whether psychotherapy does any 
good—one might as reasonably ask, “Is life 
worth living?”; the question is how does therapy
accomplish its ends in those fortunate
instances where, despite the adverse odds, it 
manages to succeed. It is to the process, not 
the superficially appraised end result—to the 
disorganization or organization of forces which 
may spell illness and partial death or health 
and growth—that attention should be directed 
if we are to learn anything about psychother- 
apy. If, in aiming at this admittedly more 
difficult kind of assessment, we shall have to 
postpone rigorous quantification or use it 
guardedly, to ensure genuine precision, such 
caution may well constitute a measure of the


CASE REPORT


A CASE STUDY IN A BEHAVIORAL ANALYSIS OF 
PSYCHOTHERAPY1


EDWARD J. MURRAY 
Yale University


1 This paper was presented at the 1952 meeting of
the Eastern Psychological Association, Atlantic City,
New Jersey.

PSYCHOTHERAPY is of considerable interest
to many psychologists today. This is
partly because psychotherapy is the
only rational approach to the treatment of
neuroses and psychoses. It is also because
psychotherapy is a unique source of data
about some of the most important and elusive
processes of human behavior. Yet, there is
much we need to learn about psychotherapy.
It is not clear why only some patients improve.
Current research using measures before and
after therapy, or comparing therapy with no
therapy, yields little useful information;
such an approach establishes no relationship
between what actually occurs during therapy
and the outcome. Thus, there is no rationale
for gradually improving tactics or understanding
the changes in various kinds of patients.
In spite of the general descriptions of therapy
that are available, it is not clear just what
happens between a therapist and his patient.
To some extent this is due to the complexity
and the number of the events in therapy. In
addition, we have to contend with the subjectivity
of the therapist’s report. What is
needed is an objective behavioral description
of psychotherapy. Such a description must be
comprehensive enough to capture the important
events and yet simple enough to clarify
the complexity of the events. With an adequate
description of the events in psychotherapy,
studies on the prediction of therapeutic
progress from psychological tests, as well as
studies on the evaluation of therapy using
outside criteria, will take on new meaning and
exert more influence on therapeutic conduct.
The purpose of this paper is to describe a
first step which was taken in the direction of
an adequate description of psychotherapy.
Many events occur during psychotherapy.
Those which are readily observable may be
grouped as physiological, gross behavioral,
and verbal. All three groups have been studied
(e.g., 3, 10) and should be studied further.
However, verbal behavior seems to be most
critical from many points of view. There
have been studies on the grammatical and 
formal properties of verbal behavior in therapy
(e.g., 2), which are interesting in many ways 
but do not seem to be related to the major 
theories of personality in any determinate 
way. The content or meaning functions of 
verbalization seem much more relevant. The 
general method for studying such material 
is called content analysis. The research on 
psychotherapy done by the Rogers group (10) 
uses content analysis. However, in content 
analysis the categories which one selects are 
determined by theory. The theory guiding 
the Rogerian content analysis appears to be 
a vague perceptual schema. The content
analysis which we are developing is guided 
by two other points of view: psychoanalytic 
theory and learning theory. In this context 
the most relevant categories are those con- 
cerned with motivation and defense. Thus, 
we propose to study the content of verbal 
behavior in psychotherapy with respect to 
underlying motives and defenses. This study 
illustrates this approach with one psycho- 
therapy patient. 


A CASE STUDY


The patient was seen for 17 hours in an 
outpatient clinic.² All hours except the thirteenth
were phonographically recorded. The
first hour was omitted because it consisted 
of history taking. The patient was referred 
by the medical clinic. His complaint was that 
“he has trouble getting to sleep at night— 
feels that if he falls asleep he may die. He is 
tremendously threatened but can’t say what 
he is threatened by.” At the time of the first 
interview the patient was described by the 
therapist as follows: 

The patient is a well built, good looking young man 
of 24. His family was once well-to-do, he’s a college 
graduate, and he’s now working in a real estate firm... . 


2 The author wishes to thank Dr. Larry Hemmendinger
of the Veterans Administration Regional Office,
Bridgeport, Connecticut, for providing this case material.

His parents were divorced when he was eight after a
protracted period of arguments which the patient remembers
as painful to him. Apparently his mother was
unfaithful and a good deal of the dispute was about this. 
The patient also remembers being taunted about his 
mother’s behavior by one of his companions. Before 
the divorce the entire family moved about and finally 
lived with the maternal grandmother. He was sur- 
rounded by dominating and pampering females. He now 
relates this to a “complex” of going from person to 
person for aid. After the divorce he lived with his aunt. 
His mother remarried, had a child, who is now nine, 
and moved in with the patient and his aunt, which is 
the present home situation. His mother was harsh to 
him and he developed a distaste for her. He wanted to 
be different from her and rejected her “emotional” 
kind of living in favor of a “logical and rational ap- 
proach.” He adjusted to school and social relations 
with boys very easily. He tended, though, to be a 
teacher’s pet in grammar school. He did well in science 
and math in high school and started out in biology at 
college. Upon his return to college after the war he 
changed to the social sciences taking his degree in 
history. His relations with girls were extremely in- 
nocent until he went in the army. He masturbated 
during adolescence in spite of guilt and fear about it. 
The army’s attitude about masturbation enabled him 
to overcome a good deal of this guilt. While in the army 
he went out with a college girl without ever having 
intercourse, avoiding the kind of girls with whom he 
might have had sex and avoiding one situation with 
this girl which might have led to sex. While overseas 
he did have some sexual relationships. Generally he 
seems very cooperative and well motivated. 


Therapy was mainly supportive but in- 
cluded interpretations about the defensive 
nature of his physical complaints and the 
hostility which arose when he became de- 
pendent. A permissive attitude toward the 
expression of hostility was maintained. 

After a considerable period of trial and 
error, two main categories were selected for 
content analysis: hostility and defense state- 
ments. The hostility category was divided 
into six subcategories of hostility: (a) mother,
(b) aunt, (c) other people, (d) general situations
and groups, (e) the therapist, and (f)
a vague “at home” (referring to one or more
persons in his home). The defense category 
was composed of: (a) intellectual defensive 
statements which included the patient’s 
views on philosophy, science, current events, 
etc., and (b) complaints about a wide variety
of physical symptoms and discomforts. Every- 
thing else was considered irrelevant. 

The unit scored was called a statement. 
This was either a simple sentence or the mean- 
ing phrases of a more complicated sentence. 
Each statement was judged as belonging to 
one of the several categories. The main measure 
was the number of statements in a given 
category for each hour. Relationships from
hour to hour between categories, and between 
categories and the behavior of the therapist, 
were determined. 
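
The counting procedure just described can be sketched in code. This is a minimal illustration and not the procedure actually used in the study: the category shorthand, the function name, and the sample judgments are all hypothetical.

```python
from collections import Counter

# Shorthand labels for the categories named in the text (labels are our own).
HOSTILITY = ("mother", "aunt", "other_people", "general", "therapist", "at_home")
DEFENSE = ("intellectual", "physical")
CATEGORIES = HOSTILITY + DEFENSE + ("irrelevant",)


def tally_by_hour(judged):
    """Count statements per category for each hour.

    `judged` is an iterable of (hour, category) pairs, one pair per statement.
    """
    counts = {}
    for hour, category in judged:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        counts.setdefault(hour, Counter())[category] += 1
    return counts


def pooled(counter, names):
    """Sum the counts of several categories (e.g., all hostility subcategories)."""
    return sum(counter[name] for name in names)


# Hypothetical judgments, for illustration only (not data from the case).
judged = [
    (2, "intellectual"), (2, "intellectual"), (2, "irrelevant"), (2, "mother"),
    (3, "physical"), (3, "mother"), (3, "mother"),
]

counts = tally_by_hour(judged)
for hour in sorted(counts):
    print(hour, pooled(counts[hour], HOSTILITY), pooled(counts[hour], DEFENSE))
```

Keeping one counter per hour makes it simple to pool subcategories later, which is how the hostility and defense totals are formed.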

The record of each hour was played very 
slowly, and each statement was judged either 
as belonging to one of the eight categories or 
as irrelevant. A reliability study with three 
other judges scoring the same hour showed 
that the categories were defined in a way which 
was precise enough for teaching other people 
the method. The reliability was high when 
the eight categories were compared (r = .86, 
.89, and .91; p < .01 in all cases), and even
higher when the irrelevant category was added 
(r = .94, .95, and .98; p < .01 in all cases). 
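
A reliability coefficient of this kind can be illustrated as a product-moment correlation between two judges' category counts for the same hour. The sketch below assumes that this is how the comparison was made, which the text does not state; the counts are hypothetical and the function name is our own.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical counts in the eight categories for one hour (illustration only).
author_counts = [12, 3, 7, 5, 1, 4, 20, 9]
judge_counts = [11, 4, 6, 5, 2, 4, 18, 10]

r = pearson_r(author_counts, judge_counts)
print(f"r = {r:.2f}")
```

Adding the irrelevant category appends a ninth pair of counts to each list; because that category is usually large, it tends to raise the coefficient, as reported above.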


HOSTILITY AND DEFENSES


Expressing hostility was a major problem 
for the patient. In the summary of the treat- 
ment, the therapist says, “his problems today 
seem centered around dependency and concomitant
resentment.” However, the therapist
did feel that some progress was made in his 
ability to express his hostility as well as to 
see its relationship with his dependency. If 
the patient had trouble expressing hostility, 
then his hostility must have aroused anxiety. 
From the theory of conflict (1, 4) we would 
predict that if hostility is inhibited by anxiety, 
then, if anxiety is reduced by the permissive 
attitude of the therapist, the overt manifesta- 
tions of hostility should increase while the 
overt manifestations of anxiety should de- 
crease. In this study it is assumed that the 
hostility statements are the overt manifesta- 
tion of hostility, and the defense statements 
the overt manifestation of anxiety. Figure 1 
shows the hostility and defense statements 
throughout the course of psychotherapy. 
It can be seen that hostility increases and 
defenses decrease. Moreover, the hour-to- 
hour fluctuations show a true reciprocal rela- 
tionship which strengthens the conflict analy- 
sis. The crisscrossing from Hour 4 to Hour 8 
takes place in between the part of the therapy 
when defenses were high and the later part 
when the expression of hostility was high. The 
correlation is negative and highly reliable (r =
-.73, p < .01).
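
The reliability of a correlation of this size can be checked with the usual t transformation of r. The sketch below assumes 15 scored hours (17 hours less the omitted first hour and the unrecorded thirteenth) and a two-tailed test; the text states neither the n nor the test actually used, so both are assumptions.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic for testing a correlation r against zero, with n - 2 df."""
    if n < 3:
        raise ValueError("need at least 3 paired observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


R_REPORTED = -0.73       # value given in the text
N_HOURS_ASSUMED = 15     # assumption: 17 hours minus hours 1 and 13
T_CRIT_01_DF13 = 3.012   # two-tailed .01 critical value of t for 13 df

t = t_from_r(R_REPORTED, N_HOURS_ASSUMED)
print(f"t = {t:.2f} with {N_HOURS_ASSUMED - 2} df")
print("beyond the .01 level:", abs(t) > T_CRIT_01_DF13)
```

Under these assumptions the statistic exceeds the tabled value, which agrees with the reported p < .01.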

However, the objection might be raised 
that this is not a dynamic relationship between 
the two categories because the total number 
FIG. 1. FREQUENCY OF HOSTILITY AND DEFENSE
STATEMENTS THROUGHOUT THERAPY


of statements is limited and when one category 
goes up the other must come down. This is 
not the case, because, with the exception of 
one hour, pooled hostility and defense state- 
ments never constituted more than 60 per 
cent of the total statements of one hour. An 
objection which can be made to this is that 
there may be nothing unique about the hos- 
tility-defense relationship if the residual or 
irrelevant statements also correlate with 
either hostility or defenses. This objection is 
ruled out by the fact that the residual is not 
correlated with hostility (r = -.20) or with
defenses (r = .19). Another objection which 
might be raised is that, since the total number 
of statements varied from hour to hour be- 
cause of differences in the patient’s rate of 
speech, the length of pauses, and the number 
of therapist’s remarks, the relationship is an 
artifact of the different totals. That this is 
not the case is shown in Fig. 2, where the 
percentage of the total number of hostility 
and defense statements is plotted.³ In this
form, hostility is still negatively and highly 
reliably correlated with defenses (r = -.75,
p < .01). Hostility in percentage form cor- 
relates very highly with hostility in the fre- 
quency form in Fig. 1 (r = .96, p < .01). 
This is also true of defenses (r = .94, p < .01).
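
The conversion to percentage form, and the check that pooled hostility and defense statements leave a substantial residual, can be sketched as follows. The counts are hypothetical and serve only to show the arithmetic; the 60 per cent figure is the ceiling mentioned in the text.

```python
def to_percent(counts, totals):
    """Express each hour's count as a percentage of that hour's total statements."""
    return [100.0 * c / t for c, t in zip(counts, totals)]


# Hypothetical hourly counts (illustration only, not data from the case).
hostility = [5, 8, 10, 20, 25, 30]
defense = [40, 35, 30, 15, 10, 5]
total = [100, 90, 95, 100, 110, 105]

hostility_pct = to_percent(hostility, total)
defense_pct = to_percent(defense, total)

# Pooled share of the two categories, and the residual (irrelevant) share.
pooled_pct = [h + d for h, d in zip(hostility_pct, defense_pct)]
residual_pct = [100.0 - p for p in pooled_pct]

# Hours (numbered from 1) in which the pooled share exceeds 60 per cent.
hours_over_ceiling = [i + 1 for i, p in enumerate(pooled_pct) if p > 60.0]

print([round(p, 1) for p in hostility_pct])
print([round(p, 1) for p in defense_pct])
print("hours over 60 per cent:", hours_over_ceiling)
```

When the residual is large, a rise in one category does not force a fall in the other, which is the point of the argument above.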

We feel that this result strongly suggests 
that the patient’s anxiety about expressing 
hostility was decreased as a result of the thera- 
pist’s activity and/or inactivity. However, 
this may be limited only to the therapeutic 
situation; outside observations are needed to 
demonstrate any more general conclusion. 


3 The results presented in Figs. 3, 4, and 5 also show 
little change when plotted in percentage form. 
The verbal changes shown here may or may
not be indicative of fundamental emotional 
change. Here again, other evidence is needed. 
It is also conceivable that this hostility was a 
defense against a more anxiety-arousing 
sexual problem. But in any event, the lawful- 
ness of the change is encouraging for the use- 
fulness of the method. 


THE DISPLACEMENT OF HOSTILITY 


An examination of the occurrence of state- 
ments in the subcategories of hostility from 
hour to hour reveals a sequence of persons 
toward whom hostility is directed rather than 
a global display on each hour. Figure 3 shows 
hostility to mother, aunt, and a combined 
“others” and “general.” The sequence strongly
suggests displacement. From a psychoanalytic 
point of view we would expect the mother to 
be the recipient of the most basic hostility. 
Mother, aunt, and others form a meaning- 
ful gradient of generalization. The patient’s 
hostility to his mother gradually increased 
up to Hour 6. The therapist’s summary of 
Hour 6 says, “he revealed that his mother had
punished him for masturbating. She also forced 
food upon him and punished him for not eating 
promptly. He was ‘too little to fight back’ 
but did act in a spiteful way several times. He 
described other hostilities and retaliations 
with respect to his mother.” The following 
hour showed little hostility. Then for several 
hours he expressed hostility to his aunt. This 
is viewed as primarily a displacement from 
the mother, although the aunt elicited her 
own share of hostility. Following this there 
was a displacement to other people. We ex- 
pected the sequence to reverse itself, i.e., 
after the hostility to others, hostility would
be expressed first to the aunt again and finally 
to the mother again. However, a dramatic 
external event took place between Hours 15 
and 16: his aunt was suddenly taken to the 
hospital for an emergency cancer operation. 
The therapist says that during Hour 16, 
“after considerable struggle he finally stated 
an ambivalency about the event.” Further- 
more, his hostility to his mother was increased 
because she had hinted that if the aunt died, 
she would take over the home. 

It should be noted that the learning analysis 
of displacement (5), which has been confirmed 
in several animal experiments (6, 7, 9), does 
not predict sequences of hostility to various 
objects. Since it is desirable to make such 
predictions, we embarked on an extension of 
displacement theory (8) and an experimental 
investigation, in collaboration with Mr. 
Mitchell Berkun, of displacement with animals 
in a free-choice situation. Suffice it to say that 
if rats are both rewarded and punished at a 
given goal so as to establish a conflict, they 
will still approach that goal part way before 
displacing to another goal. This is an example 
of how precise data from psychotherapy can 
influence theories of behavior based on animal 
work. 

The fact that hostility was directed toward 
people who were further and further displaced 
provides an alternative explanation for the 
general increase in hostility throughout 
therapy. If expressing hostility to these people 
aroused less anxiety than expressing hostility 
to his mother, we would expect more and more 
hostility as therapy progressed. Probably
both displacement and a general reduction of
anxiety because of the treatment operated to 
permit hostility to increase. The hostility 
expressed to the patient’s mother in Hour 16 
was much stronger and was concerned with 
much more recent events than was the hos- 
tility in Hour 6. Thus, there was a therapeutic 
effect. Indeed, expressing hostility to relatively 
unimportant people may be therapeutically 
valuable because it is less fearful. In learning 
theory terms, fear is extinguished in the dis-
placed situation and these extinction effects 
are generalized back to the fear in the primary 
conflict situation. In the experiment mentioned 
above, the rats returned to the original goal 
after making unpunished goal responses in the 
displaced situation. 


DEFENSES 


The two defense categories seemed to 
operate as alternate members of a defensive 
armamentarium. In Fig. 4 it can be seen that 
the intellectual defense was high at the outset 
and decreased throughout the first part of 
therapy. This intellectual defense was never 
interpreted. The physical complaint defense 
increased as the intellectual defense decreased. 
After the major interpretation in Hour 5, 
the physical complaint defense dropped off 
sharply. In Hour 6 both defenses were low, 
and this hour proved to be especially fruitful, 
as was noted above. This expression of hos- 
tility also increased his anxiety, and the 
following hour was a dull one. Moreover, with 
the physical complaint defense interpreted, 
the noninterpreted intellectual defense in- 
creased in a compensatory way in this subse- 
quent hour. Both defenses decreased during 
the rest of psychotherapy and show no dis- 
tinguishing features as far as the number of 
responses is concerned. The interpretation of 
the physical complaints may have functioned 
as a punishment, or may have established 
insight. Further indices are needed to dis- 
tinguish between the two possibilities. 

There is also a possibility that the two 
defenses had different functions. The intel- 
lectual defense was more assertive and self- 
aggrandizing. It may have been a way of 
telling the therapist that he was masculine 
or mature. On the other hand, the physical 
complaints had a pleading and ingratiating 
tone. This defense may have been motivated 
by feminine or dependent needs. However, 
both may be viewed as motivated by anxiety
and both served the function of avoiding talk 
about conflict areas. 

Fortunately, we examined another case 
which showed some striking similarities to 
this one. The general personality picture was 
the same as the present case. In addition, the 
two chief defenses were similar to the physical 
complaints and intellectualizations characterizing
our present case. The second patient
had physical complaints, although he had
more insight into their psychogenesis. A cate- 
gory comprising physical complaints and more 
psychologically phrased feelings of tension,
conflict, and blocking was defined. The in- 
tellectual defense was quite similar to the 
first case. Figure 5 shows the course of physical,
etc. defensive statements and intellectual
defensive statements throughout the eight 
hours of therapy. This second patient began 
with physical complaints which decreased 
from Hour 1 to Hour 3 without interpretation. 
As the physical complaints decreased, the 
intellectual defense increased. This was in- 
terpreted in a punitive way in Hour 2 and, 
less severely, in Hours 3 and 4. Both defenses 
were low in Hours 3 and 4 and it was in these 
hours, both in our opinion and in the unso- 
licited opinion of the therapist, that the most 
important hostile material came out. In Hour 
5 it was the uninterpreted defense—physical 
complaints—that showed the greatest increase. 
In this case, the therapist then proceeded to 
interpret the physical complaint defense on 
Hours 5 and 6. Both defenses were low for 
the last two hours. 

This second case in a sense provides a natural 
experimental control for the first case. In the 
first case intellectual defenses were highest at
first and decreased without interpretation; 
in the second case this was true of physical 
complaints. In the first case physical com- 
plaints supplanted intellectual defenses; in 
the second case intellectual defenses sup- 
planted physical complaints. In the first 
case the physical complaints met with the 
disapproval of the therapist; in the second 
case this was true of intellectualizations. In 
both cases the punitively interpreted defenses 
decreased. In both cases important hostile 
material emerged when both physical com- 
plaints and intellectualizations were low. In 
both cases the expression of hostility was 
followed by an upswing in the uninterpreted 
defense. These results strengthen our belief 
that the physical complaints and intellectuali- 
zations are alternate defenses against anxiety. 
They also tend to confirm the general conflict 
analysis we have made. The increase in de- 
fenses after the expression of hostility may 
be related to the “negative therapeutic effect”
(1). We may also tentatively formulate the 
hypothesis that, with this kind of patient, 
uninterpreted (or unpunished) defenses have a 
greater probability of occurrence when anxiety 
is increased than interpreted (or punished) 
defenses. It is obvious that much more evi- 
dence is needed before making any definite 
conclusions. 


SUMMARY AND CONCLUSIONS 


In our present state of knowledge about 
psychotherapy, final conclusions should be 
avoided. Therefore, it is with trepidation that
we approach the task of making a comprehen- 
sive summary statement about our illustrative 
case. This is true even though the therapy was 
brief and the changes were probably not 
fundamental. We might ask about the success 
of this case. The therapist does not sound 
optimistic in his summary: “The patient now 
has insight into his problems but can’t extri- 
cate himself from them. Because of a general 
character weakness it is improbable that short 
term therapy will be of much more value.” 
But our feeling is that it is important to 
understand what has happened during these 
interviews, whether or not the therapy was 
eminently successful. We feel that our ob- 
jective description aids in this. Here is a 
tentative recapitulation of the case in terms 
of the data presented in this study. 

The patient began treatment with a good 
deal of anxiety and subsequent defensiveness. 
His defensiveness decreased as a result of a 
combination of permissiveness about hostility 
and punitiveness about defenses on the part 
of the therapist. As this occurred he expressed 
strong hostility to his mother. This expression 
of hostility led to an increase in anxiety and 
defensiveness. The defense which increased 
was the one not previously punished by the 
therapist. Subsequently, hostility was dis- 
placed further and further away from his 
mother. Hostility to displaced objects was 
stronger because it aroused less anxiety. It is 
possible that because of the unpunished ex- 
pression of hostility to the displaced objects, 
the patient was able later in therapy, when 
environmental factors precipitated it, to 
express hostility about his mother much more 
strongly, at least in the therapeutic situation. 

Although this integration involves several 
assumptions, we feel it is strongly supported 
by the data. It will be noted that the state- 
ments made in the summary concern motiva- 
tional and defensive shifts. This is what the 
categories were set up to measure. Other 
events, such as the establishment of insight, 
require other kinds of categories. Thus, we 
offer no evidence to support the therapist’s 
opinion that the patient understood the rela- 
tionship between his hostility and his de- 
pendency at the end of therapy. 

This preliminary study has been presented 
to illustrate the kinds of results which can
be obtained by studying psychotherapy carefully
and quantitatively. We feel that it also
demonstrates the applicability of principles 
derived from animal and human experimenta- 
tion to complex human behavior. On the 
other hand, it is only data as clear as these 
that will stimulate experimental work and 
modify existing theories. We also hope that 
this study has indicated some of the difficulties 
involved in a behavioral analysis of psycho- 
therapy. Cases do not miraculously become 
comparable when studied quantitatively. New 
cases are expected to present as many problems 
as they solve. Nor do we feel that all of the 
important events in psychotherapy can be 
described by this method. Indeed, many im- 
portant things in this case have been ignored. 
But, as this method becomes more refined, 
and as equally objective measures are added 
to it, we feel that our understanding of psycho- 
therapy, and human behavior in general, will 
be furthered. 
THE USE OF TAPE RECORDING TO SIMULATE A GROUP ATMOSPHERE¹


ROBERT R. BLAKE²
The University of Texas 


THE present study explores the possibility that
tape recordings can be used to communicate
to a test subject the experience that he is a participating
member of a social group. If this method
can be employed to investigate the impact of group
membership on individual behavior, its research
applications should prove to be numerous.


THE EXPERIMENT 
Application of the Method


To test the feasibility of this approach for a 
standard problem, a brief description of a study 
based on the autokinetic phenomenon will be 
given (1). The situation used is comparable in 
many respects to that employed by Sherif (2) in 
his investigation of autokinetic judgments in the 
group context. The important differences are that 
(a) in our situation the subjects (Ss) understand 
that each is experiencing the same amount of 
movement, but that they are looking from different 
observation posts, as would be the case, for ex- 
ample, on shipboard (thus accounting for the fact 
that they are not physically together); and (b) our
Ss used the millimeter scale, rather than inches, as 
the unit through which they express their judg- 
ments. 

As Sherif originally reported, judgments of the 
movement experienced in the autokinetic situation 
are responsive to group pressures, the rule being 
that Ss’ responses tend to converge on a norm 
established by the group. Given that finding, it is 
predicted that the uninstructed S experiencing the 
present situation will tend to locate his judgments 
somewhere within the range of reports given by 
the “taped” Ss. If he does give responses within 
this range and if the range used is a most improbable
one, as inferred from the frequency with which
responses are given within it by Ss who judge in a
situation without social stimulation, this would
constitute evidence for the conformance-producing
power of the recorded group situation.

1 The study was conducted in the Laboratory of
Social Relations, Harvard University, and was sponsored
in part by the United States Air Force under
contract No. AF 33(038)-12782 monitored by the
Human Resources Research Institute. Permission is
granted for reproduction in whole and in part by or for
the United States Government. The cooperation of the
United States Navy in facilitating the conduct of the
research is also gratefully acknowledged. The authors
express their appreciation to Dr. Henry W. Riecken
for his help with design and administrative problems.

2 Leave of absence from The University of Texas,
1951-1952.


Explanation to the Subjects 


After taking him to the darkroom and adjusting 
his head-set and power microphone, the experimenter 
(E) left, informing S that he and the rest of the group
who were located in adjoining rooms would be given 
further instructions over headphones. The S was also
told that his code letter would be “G.” The E then 
returned to the control room, started the tape recorder, 
and turned on a switch completing the circuit to S. 
The explanation that followed was addressed to the 
“taped” participants as well as to the uninstructed, 
critical S. The recording ran as follows: 

Experimenter: All right, here we go. I’m talking to 
you from our control room. Now I’ll tell you all what 
I want you to do. But first I want to know if you can 
hear me all right. If you can, please answer okay when 
I call your code letter. Subject A (okay); subject B 
(okay); . . . subject E (okay); subject F (okay); subject 
G (four-second space). All right, that’s fine. 

Now here’s what the experiment is all about. Navy 
men on board ship or flying occasionally have to judge 
the distance which a light moves on the horizon. The 
Navy needs to know how men make such judgments 
when the conditions are poor—such as when it’s dark. 

In this experiment each of you will estimate how far 
a light moves in the dark. You will then report your 
estimates. As on board ship, you will be able to hear 
each other’s reports. The answers you all give will then 
be used by our control room to make a single estimate 
of the distance the light moved. Since a person cannot 
be very accurate under these conditions, the combined 
judgments of the group will be much better than the 
individual reports. It’s the group performance which 
makes the difference between a hit and a miss in this 
kind of work. 

Now here are your instructions. Please listen care- 
fully. 

When the experiment begins, a light will appear 
in front of you. Shortly after it appears it will begin 
to move. It will only be visible for a few seconds. Your 
task is to estimate the total distance the light moves. 
As soon as it stops, you will each be called by your code 
letter to give your estimate of how far it moved. Give 
your report in millimeters. (The Ss were shown a 
millimeter scale before the beginning of the experi- 
ment.) 

Are there any questions? 
Subject D: Yeh, I’ve got a question. Does this light
move the same amount for all of us each time it 
comes on? 

Experimenter: Yes, we're doing this just like we did 
before. Okay? Ready for trial one. (Ten-second space 
in the tape.) Now give me your reports to the nearest 
millimeter; first, subject A—(69); subject B . . . subject
G—(five-second space). Ready for trial two. (Eight- 
second space). Etc. 

At the end of trial 10, E said, “That’s the end of
this part of the experiment. Please remain in your 
seats until I bring you a questionnaire to fill out.” 

The E then asked the uninstructed S to fill out a
questionnaire. After discussion and explanation of what- 
ever part of the experiment he was interested in, S 
was released. As far as could be determined, no one
diagnosed the concealed features of the study. 


Subjects 


The Ss participating in this study, 84 in number, 
were drawn from Naval enlistees and Harvard Uni- 
versity undergraduates. 


Experimental Design 

Three sets of conditions using simulated groups and 
one condition involving judgment in the “alone” situation
were employed. The three synthetic conditions
involved 10 trials, each consisting of six numerical 
values spoken by the “participants” as their judgments 
before the uninstructed S was called to give his report. 
The 10 trials varied in two ways: they had five different 
ranges and they were built on two different scale cen- 
ters. In all cases the numerical distances between 
individual judgments were of equal amounts, and no 
numerical value was repeated in the same trial. Thus 
the median and mean coincided at the center for any 
given scale. 
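
The construction rule just stated (six equally spaced values, none repeated, centered on the scale center) can be sketched as follows. The function name is our own, and the sketch assumes that the range is the difference between the highest and lowest spoken values, which is what the entries of Table 1 imply.

```python
from statistics import mean, median


def stimulus_values(center, value_range, n=6):
    """Return n equally spaced values spanning `value_range`, centered on `center`."""
    step = value_range / (n - 1)
    low = center - value_range / 2
    return [low + i * step for i in range(n)]


# Trial 1 of the diverging condition: range 15 around a center of 45.5.
first_trial = stimulus_values(45.5, 15)
print(first_trial)

# With equal spacing the mean and the median both fall on the scale center.
print(mean(first_trial), median(first_trial))
```

Because the spacing is equal, any listener who averages the six reports, by either mean or median, arrives at the scale center.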

For the diverging condition, scale centers alternated 
from 46 to 57, starting with the low center. The diverg- 
ing conditions started with a small range, with the 
range size increasing as the trials progressed. Trial 
one had a range of 15, trial two of 15, trial three of 20, 
and so on up to trials nine and ten, each of which had a 
range of 45. The converging condition represents an 
exact duplication of the diverging one but in reverse 


TABLE 1

RANGES, SCALE CENTERS, AND NUMERICAL VALUES SPOKEN BY RECORDED SUBJECTS UNDER THE SYNTHETIC
GROUP CONDITIONS

         CONDITION I* (DIVERGING RANGE)           CONDITION III (NONOVERLAPPING RANGE)
TRIAL    RANGE   CENTER   STIMULUS VALUES         RANGE   CENTER   STIMULUS VALUES
  1       15      45.5    38-41-44-47-50-53        35      51.5    34-41-48-55-62-69
  2       15      56.5    49-52-55-58-61-64         5      30.5    28-29-30-31-32-33
  3       20      46.0    36-40-44-48-52-56        30      52.0    37-43-49-55-61-67
  4       20      57.0    47-51-55-59-63-67        10      31.0    26-28-30-32-34-36
  5       30      46.0    31-37-43-49-55-61        20      52.0    42-46-50-54-58-62
  6       30      57.0    42-48-54-60-66-72        20      31.0    21-25-29-33-37-41
  7       40      46.0    26-34-42-50-58-66        10      52.0    47-49-51-53-55-57
  8       40      57.0    37-45-53-61-69-77        30      31.0    16-22-28-34-40-46
  9       45      45.5    23-32-41-50-59-68         5      51.5    49-50-51-52-53-54
 10       45      56.5    34-43-52-61-70-79        35      30.5    13-20-27-34-41-48

* Condition II is the same as condition I but in reverse sequence.


TABLE 2

THE FREQUENCY OF RESPONSES LOCATED WITHIN THE SOCIAL RANGE CREATED BY THE SIMULATED (RECORDED)
GROUP, CONTRASTED WITH THE COMPARABLE FREQUENCY FOR THE NONSOCIAL SITUATION

         CONDITION I                    CONDITION II                   CONDITION III
         CONTROL    EXP.                CONTROL    EXP.                CONTROL    EXP.
TRIAL    FREQ.      FREQ.      p        FREQ.      FREQ.      p        FREQ.      FREQ.      p
         (N = 27)   (N = 20)            (N = 27)   (N = 20)            (N = 27)   (N = 17)
  1        1          6       <.05        2         16       <.01        2         17       <.01
  2        1          5       <.10        4         12       <.01        2          9       <.01
  3        1         13       <.01        1         16       <.01        1         17       <.01
  4        0         12       <.01        2         11       <.01        2         13       <.01
  5        1         16       <.01        0         13       <.01        0         14       <.01
  6        1         12       <.01        2         11       <.01        4         14       <.01
  7        4         18       <.01        1         11       <.01        1          9       <.01
  8        4         17       <.01        4         10       <.20        4         17       <.01
  9        6         13       <.01        2          8       <.05        2          2       >.95
 10        3         15       <.01        0          6       <.01        4         11       <.01


since it began with wide ranges and ended with narrow 
ones. The nonoverlapping condition alternated between 
scale centers of 31 and 52, starting with the high value. 
However, in this condition ranges alternated from large 
to small. These ranges were designed in such a way 
(varying from 5 to 35 in width) that adjacent ranges 
never overlapped. The composition of each condition 
is reproduced in Table 1. In the actual experiment, the 
recorded order of numerical values within each trial 
was randomized. 
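
The nonoverlapping property can be verified directly from the lowest and highest values listed for each trial in Table 1. The sketch below is only a check on the table, with function names of our own choosing; it treats two ranges as overlapping when they share at least one value.

```python
# (lowest, highest) spoken value for trials 1-10, taken from Table 1.
CONDITION_I = [(38, 53), (49, 64), (36, 56), (47, 67), (31, 61),
               (42, 72), (26, 66), (37, 77), (23, 68), (34, 79)]
CONDITION_III = [(34, 69), (28, 33), (37, 67), (26, 36), (42, 62),
                 (21, 41), (47, 57), (16, 46), (49, 54), (13, 48)]
# Condition II is condition I in reverse sequence.
CONDITION_II = list(reversed(CONDITION_I))


def ranges_overlap(a, b):
    """True when two closed intervals (low, high) share at least one value."""
    return max(a[0], b[0]) <= min(a[1], b[1])


def adjacent_overlaps(trials):
    """Trial numbers t (from 1) for which trial t overlaps trial t + 1."""
    return [i + 1 for i in range(len(trials) - 1)
            if ranges_overlap(trials[i], trials[i + 1])]


print("Condition I:", adjacent_overlaps(CONDITION_I))
print("Condition III:", adjacent_overlaps(CONDITION_III))
```

In the nonoverlapping condition, then, a response that falls inside the range of one trial lies outside the range of the trial before it, so a subject cannot land within the social range merely by repeating his previous report.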

The fourth condition, introduced to furnish evidence 
for judgmental tendencies in the alone situation, 
provides the control data. Operating under conventional 
autokinetic conditions, but using exposure periods of 
the same length as those employed for the simulated 
group conditions, 27 Ss simply judged the distance 
of movement for 10 consecutive trials. 


RESULTS 


Table 2 shows the frequency with which control 
Ss gave responses within the range of numbers 
spoken by the recorded Ss under each of the three 
conditions outlined above, as contrasted with the 
frequency with which uninstructed Ss located 
their responses within the same range. It shows 
that under each of the synthetic group conditions 
more than 50 per cent of the responses fell within 
the spoken range of the recorded Ss, while only 
about 4 per cent of the responses given under the 
alone condition were located within the same 
region. The differences, which are of high statistical 
significance, point to the conclusion that under 
each of the three conditions the responses of the 
uninstructed Ss operating as members of “syn- 
thetic” groups were changed in the predicted 
direction. 
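
The section does not say which test produced the p values in Table 2. As a rough check, and only as an assumption about method, a one-sided Fisher exact test can be applied to a single row. The sketch below uses the Trial 1, Condition I counts (1 of 27 control Ss and 6 of 20 experimental Ss inside the spoken range); the function name is illustrative and not taken from the study.

```python
from math import comb

def fisher_one_sided(hits_a, n_a, hits_b, n_b):
    """Probability that group B shows at least hits_b in-range responses
    when both groups share one in-range rate (hypergeometric upper tail)."""
    total_hits = hits_a + hits_b
    total_n = n_a + n_b
    denom = comb(total_n, total_hits)
    upper = min(total_hits, n_b)
    return sum(
        comb(n_b, k) * comb(n_a, total_hits - k)
        for k in range(hits_b, upper + 1)
    ) / denom

# Trial 1, Condition I: control 1 of 27, experimental 6 of 20.
p_trial1 = fisher_one_sided(1, 27, 6, 20)
print(f"one-sided p = {p_trial1:.4f}")
```

Under these assumptions the probability falls between .01 and .05, which agrees with the tabled entry of <.05 for that row.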


DISCUSSION AND CONCLUSIONS 


In this study a group frame of reference was 
created for the autokinetic effect solely by auditory 
stimulation. The reports of all “other” Ss were 
recorded and communicated to the critical S over 
headphones. Eighty-four Ss distributed among 
three experimental and one control condition were 
asked to give 10 autokinetic judgments. For the 
experimental conditions, judgment was made after 
S listened to recorded reports from six other people. 
A control was provided by 27 Ss, each of whom 
made 10 judgments under the nonsocial condition. 
Owing to effective concealment of the critical 
feature in the study, the test Ss were, as far as 
could be determined, unaware that the judgments 
they heard were given by persons not actually 
present in other rooms. 

It was hypothesized that under these conditions 
an S would locate his judgments within the range 
of reports given by the group as a whole. Statistical 
evaluation indicated that the frequencies of judg- 
ments falling within the recorded social ranges were 
significantly greater for Ss who judged with others 
than they were for those who judged in the alone 
situation. It may be concluded that pressures to 
change can be created by skillfully prepared record- 
ings which simulate the conditions of a “live” 
experimental situation. 

These results suggest that the tape method of 
inducing the experience of group membership may 
facilitate the conduct of group research in several 
ways. For example, the conventional procedure of 
using paid participants entails at least three diffi- 
culties: changes within the experimental procedure 
due to variations in the behavior of participants 
from time to time, problems of scheduling, and the 
expense of employing instructed Ss. The use of 
magnetic tape recording eliminates the need for 
the repeated presence of paid participants, thereby 
reducing expense and problems of scheduling, and, 
at the same time, it automatically standardizes the 
major part of the experimental procedure. It should 
be clear, of course, that such a procedure imposes 
constraints on the possibilities of interaction, thus 
having its own intrinsic limitations. 
A CASE OF ESP: CRITIQUE OF “PERSONAL VALUES AND ESP SCORES” 
BY GERTRUDE R. SCHMEIDLER 


E. I. BURDOCK 
Carnegie Corporation of New York 


Just as the lodestone depends on compensation, 
so the statistical test rests on certain governing 
assumptions which both limit its applicability and 
determine its meaning. Ignorance or disregard of 
the underlying assumptions frequently leads to 
false conclusions as to the significance or insignifi- 
cance of the results under assessment. 

In particular, one of the most commonly used 
statistical tests of significance, the t test, rests on 
the assumptions that: (a) the trait whose measurement 
is being tested is normally distributed; 
(b) the variance of the samples is homogeneous; 
(c) the samples are a random selection. 

If any of these assumptions do not hold in a 
case under investigation, then the conclusions as to 
significance of findings may be held in doubt. 

An article in the October 1952 issue of this 
journal, entitled “Personal Values and ESP Scores” 
(3), presents an instance where use of the t test 
may be questioned. In this investigation the ex- 
perimental subjects (Ss) were required to guess the 
order of items in concealed lists. The lists each 
consisted of 25 items chosen from five possibilities 
arranged in random order. The investigator ad- 
vises the reader that “with lists prepared in this 
way, the average number guessed correctly, if 
choice [sic] alone determined the guesses, would 
approximate five....” The binomial model invoked 
here is thus: $(\frac{1}{5} + \frac{4}{5})^{25}$. Each “run” of 25 
guesses by a single S is treated as a discrete 
sequence of independent events. A total of 959 
runs was secured from 122 experimental Ss, ap- 
proximately eight runs per S. In the statistical 
evaluation of the results, the investigator combined 
the runs extracted from various Ss into a single 
composite for each of two groups, a group of co- 
operative Ss appropriately termed “sheep” and a 
contrasting group of recalcitrant “goats.” 
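
The binomial model quoted above can be written out numerically. The following is a minimal sketch, assuming 25 independent guesses per run with a hit probability of one-fifth, as the text describes; the constant and function names are illustrative.

```python
from math import comb, sqrt

N_GUESSES = 25   # items per run, from the text
P_HIT = 1 / 5    # five equally likely possibilities

def binom_pmf(k, n=N_GUESSES, p=P_HIT):
    """Probability of exactly k hits in n independent guesses."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

mean_hits = N_GUESSES * P_HIT                       # expected hits per run
sd_hits = sqrt(N_GUESSES * P_HIT * (1 - P_HIT))     # chance dispersion per run
total_prob = sum(binom_pmf(k) for k in range(N_GUESSES + 1))

print(mean_hits, sd_hits)
print(round(total_prob, 12))
```

The mean of five agrees with the investigator's statement; the chance standard deviation of two hits per run holds only if the guesses within a run are independent.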

Taking the above-listed assumptions in order, 
we note the following: 

1. The universe postulated has a range of dis- 
crete values from 0 to 25 and a mean of 5. It is 
thus neither normal nor continuously distributed. 

2. No test was made for homogeneity of 
variance. 

3. The randomness of the sample is in doubt, 
since the 959 runs are not drawn at random from 
the postulated universe, but only from 122 Ss. 
Eight runs drawn from one S are not necessarily 
the same as eight runs by eight different Ss. Even 
if all Ss are responding without benefit of ESP, 
they may be expected to differ from one another in 
their attack upon the problem. At the roulette 
wheel gamblers who employ different “systems” 
have different sequences of wins and losses, al- 
though all lose in the long run. 

A more appropriate device for the assay of a 
discrete nonnormal distribution of this type would 
be a nonparametric test such as chi square. Un- 
fortunately, when we seek to recompute the data 
in the published paper, the confounding of the 
results from different Ss makes it impossible to 
allow for the variance within Ss. What we need to 
know for the proper evaluation of this experiment 
is the number of hits and misses per S for each of 
the groups considered. However, because of the 
form in which the data are presented, this critical 
information is effectively obscured. 

Although we are unable to reanalyze the in- 
vestigator’s data as presented in her Table 1 for 
the reasons adduced above, the data presented in 
her Table 2 in terms of numbers of Ss lend them- 
selves to a chi-square analysis. Chi square for all 
Ss (sheep vs. goats) with respect to the numbers of 
Ss with averages “above,” “at,” and “below” 
mean chance expectancy, a 3 × 2 classification 
with 2 df, equals 2.1757. The probability of ob- 
taining such a value by chance is .35, which does 
not support the author’s contention that sheep 
score significantly better than goats. 
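
For 2 df the upper-tail probability of chi square has a simple closed form, so the probability quoted above can be checked without tables. This is a minimal sketch; the function name is illustrative, and the closed form applies to 2 df only.

```python
from math import exp

def chi2_sf_2df(x):
    """Upper-tail probability of chi square with 2 df: exp(-x / 2)."""
    return exp(-x / 2)

# Chi square reported for the sheep vs. goats 3 x 2 classification.
p_sheep_goats = chi2_sf_2df(2.1757)
print(f"p = {p_sheep_goats:.3f}")
```

The result is close to the .35 quoted, and far above any conventional significance level.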

A similar analysis for Ss below the 90th per- 
centile on the Allport-Vernon Theoretical scale 
gives a chi square of .1387, and corresponding 
probability of .93, suggesting that sheep and goats 
here, too, can be considered samples of a common 
population. 

While the subsample of Ss at or above the 90th 
percentile on the Study-of-Values Theoretical 
scale yields expected values too small to justify 
computation of chi square, it is possible to compare 
all Ss above the 90th percentile with those below 
the 90th percentile by pooling sheep and goats. 
Chi square so computed, despite one weak fre- 
quency of 3.41, yields a value of 3.5385 for a 
probability of .18 that the observed differences are 
fortuitous. This result contradicts the author’s 
conclusion that dividing the Ss according to their 
scores on the Allport-Vernon Theoretical scale is 
meaningful. 

One further possibility remains for the evalu- 
ation of the number of Ss at or above the 90th 
percentile on the Allport-Vernon Theoretical scale. 
That is to combine the categories for those at, and 
those below, mean chance expectancy. The resulting 
2 × 2 table has one expected frequency of 3.81 
with the other three between 5 and 10. Although 
this violates Lewis and Burke’s (2) criterion that 
expected cell frequencies be at least 10, the chi 
square of 1.9753, with a probability of .17 by 
chance, is sufficiently far from the borderline of 
significance to obviate use of Fisher’s direct method 
(1, pp. 96-97). The results here again support the 
hypothesis that there is only a random difference 
between sheep and goats, even theoretical sheep 
and goats. 
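
The 2 × 2 result can be checked in the same spirit. With 1 df, chi square is the square of a standard normal deviate, so its upper-tail probability follows from the complementary error function. The sketch below uses only the figures given in the text (chi square of 1.9753, smallest expected frequency of 3.81, criterion of 10); the names are illustrative.

```python
from math import erfc, sqrt

def chi2_sf_1df(x):
    """Upper-tail probability of chi square with 1 df."""
    return erfc(sqrt(x / 2))

MIN_EXPECTED_CRITERION = 10   # criterion attributed to Lewis and Burke in the text
smallest_expected = 3.81      # smallest expected cell frequency, from the text

p_two_by_two = chi2_sf_1df(1.9753)
criterion_met = smallest_expected >= MIN_EXPECTED_CRITERION

print(f"p = {p_two_by_two:.3f}, criterion met: {criterion_met}")
```

The probability comes out near the .17 quoted and well above .05, which is the sense in which the result is far from the borderline even though the expected-frequency criterion is not met.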

On the basis of these considerations the investigator’s 
assertion that “one group of our Ss scored 
above mean chance expectancy while the other 
group scored below” is not sustained. The apparent 
difference between sheep and goats and between 
Ss with high and Ss with low theoretical values in 
the investigator’s Table 1 is merely an artifact of 
the way in which the data are tabulated. 

Having found no significant differences in the 
population with respect either to attitude toward 
the experiment or with respect to Allport-Vernon 
Theoretical scale scores, we may inquire whether 
the averages above, at, and below mean chance 
expectancy differ significantly from one another 
for the population as a whole. Upon further re- 
flection, however, we realize that the three cate- 
gories refer not to crudely grouped attributes but 
to a frequency distribution. The data as presented 
have the further peculiarity that the central cate- 
gory, “averages at mean chance expectancy,” 
represents a point on the scale, while the two other 
intervals, “averages above mean chance expectancy” 
and “averages below mean chance expectancy,” 
are broad categories. To average at 
mean chance expectancy, an S must have guessed 
correctly exactly one-fifth of the approximately 200 
items presented to him; a single item difference 
casts him into one of the other categories together 
with Ss departing widely from chance. Included in 
the totals for each of the three categories, however, 
are the results for “some latecomers” who “missed 
the beginning of the session.” Presumably, on the 
evidence of figures presented elsewhere in the 
article, these numbered at most nine, if we assume 
each latecomer missed only one run. However, we 
are unable to make any adjustment for unequal 
numbers in the absence of specific information, and 
therefore cannot say to what extent differences in 
the tabled results reflect this discrepancy. 

Now the probability of a point value in a discrete 
distribution is not very large. Even though the 
probability for the mean frequency is a maximum, 
it can be shown to approach zero as N increases, 
since 

$$p_{\max} = \frac{1}{\sqrt{2\pi N p(1 - p)}}$$


For N = 200, $p_{\max}$ = .07. If we follow the investigator’s 
binomial model, $122(\frac{1}{5} + \frac{4}{5})^{200}$, we 
should expect 7 per cent of the 122 Ss, or 8.54, to 
average at the mean by chance. Actually 16 Ss, 
twice the number expected by chance, lie at the 
mean. Although the data in the form tabulated 
are insufficient for computation of the dispersion, 
the piling up of cases at the mean suggests that 
the sample may have subnormal dispersion. A 
sequence with subnormal dispersion is character- 
ized by probability compensation. In such a 
sequence there is a negative aftereffect from an 
item to its immediate successor shown in a tend- 
ency toward an exaggerated alternation of re- 
sponses and an avoidance of repetitions. Such 
sequences are often produced by individuals who 
naively try to construct a random sequence, but 
who fail to realize that a certain amount of cluster- 
ing is necessary in a normal frequency distribution. 
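
The arithmetic behind the expected number of Ss at the mean can be retraced as follows. This is a minimal sketch, assuming exactly 200 items per S and a hit probability of one-fifth, so that the mean falls at 40 hits; it compares the normal approximation for the maximum ordinate with the exact binomial term.

```python
from math import comb, pi, sqrt

N_ITEMS = 200      # approximate number of guesses per S, from the text
P_HIT = 1 / 5
N_SUBJECTS = 122

# Normal approximation to the largest binomial probability.
p_max_approx = 1 / sqrt(2 * pi * N_ITEMS * P_HIT * (1 - P_HIT))

# Exact binomial probability of scoring exactly at the mean (40 hits).
k_mean = 40
p_max_exact = comb(N_ITEMS, k_mean) * P_HIT**k_mean * (1 - P_HIT)**(N_ITEMS - k_mean)

# Expected number of Ss exactly at the mean, using the rounded 7 per cent.
expected_at_mean = N_SUBJECTS * round(p_max_approx, 2)

print(round(p_max_approx, 4), round(p_max_exact, 4), round(expected_at_mean, 2))
```

Both the approximation and the exact term round to .07, giving the 8.54 Ss expected at the mean against the 16 observed.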

Two possible causes for this phenomenon are: 
(a) The individual choices do not add up to a 
random sample of the Ss’ behavior because 
grouping them into runs induces an interaction 
effect. (b) The behavioral universe itself is characterized 
by probability transfer with negative 
aftereffect. Such sequences, known as Markoff 
chains, may nevertheless converge toward a specific 
probability value, such as that of one-fifth postu- 
lated by this investigator. However, such sequences 
will not have as many runs as a normal sequence, 
but will show an excess of alternation over a genu- 
inely random distribution. 
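
A small numerical illustration shows how a sequence with negative aftereffect can still settle on a probability of one-fifth per symbol. The sketch assumes a five-symbol chain in which the chance of repeating the previous symbol is set to a hypothetical .05, with the remainder spread evenly over the other four symbols; none of these figures comes from the article under review.

```python
N_SYMBOLS = 5
P_REPEAT = 0.05   # hypothetical chance of repeating the previous symbol

def transition_matrix(p_repeat, n=N_SYMBOLS):
    """Rows give the next-symbol probabilities given the current symbol."""
    p_switch = (1 - p_repeat) / (n - 1)
    return [[p_repeat if i == j else p_switch for j in range(n)] for i in range(n)]

def step(dist, matrix):
    """Advance a probability distribution over symbols by one guess."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]

matrix = transition_matrix(P_REPEAT)
dist = [1.0, 0.0, 0.0, 0.0, 0.0]   # start from a fixed first symbol
for _ in range(50):
    dist = step(dist, matrix)

print([round(x, 4) for x in dist])
print("repeat probability:", P_REPEAT, "vs. independent guessing:", 1 / N_SYMBOLS)
```

Each symbol's long-run probability converges to one-fifth, yet immediate repetitions occur far less often than the one-in-five rate that independent guessing would give.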

In conclusion we are led inevitably to the per- 
suasion that the findings presented in the above 
article are due not so much to telepathy as to 
numerology. 
HOW NEUROTIC IS THE AUTHORITARIAN?¹ 


JOSEPH M. MASLING² 
Institute for Research in Human Relations 


Very few social scientists would deny that 
social science research can be carried on independently 
of prevailing social and political pressures. 
The social scientist, because of his training 
and interests, is constantly under the strain of 
examining objectively phenomena about which he 
privately has many strong, emotionally-tinted 
attitudes. An interesting example of one way in 
which a certain set of these political pressures—in 
this case the current flood of loyalty investigations 
and the threat posed by current and past dictator- 
ships—has influenced social science research can be 
seen in the work done on authoritarianism. 

It is evident that the concept of the authori- 
tarian personality has been a popular area of in- 
vestigation, and that some extremely interesting 
and valuable knowledge has been made available 
by this research. It is not the purpose of this paper 
either to evaluate or to summarize this research. 
It will suffice to say that for the most part it is 
characterized by rich, insightful conceptions of 
personality and by sound experimental design. 
What will be done is to point out how in some in- 
stances otherwise sound research has unfortunately 
been rendered less effective by the descriptions 
painted of authoritarian individuals. 

To illustrate this point two of the major works 
in this field are quoted extensively—The Authori- 
tarian Personality, by T. W. Adorno, Else Frenkel- 
Brunswik, Daniel J. Levinson, and R. Nevitt San- 
ford (1), and Authority and Leadership: A Study of 
the Follower’s Orientation of Authority, by Fillmore 
H. Sanford (10). These two works were chosen be- 
cause of their unquestioned importance in the 
field and because they are in many ways superior 
to other research done in the area. 

1 This paper was read at the 1953 meeting of the 
American Psychological Association in Cleveland, Ohio. 
2 Now at Syracuse University. 

It is evident to anyone reading the literature on 
authoritarianism that authoritarians are indeed 
nasty fellows. “They worry about egocentric and 
material things. They think in terms of blame and 
they appear to express aggression against the 
weak” (10, p. 43). Not only this, but authoritarians 
are conventional, submit uncritically in the face of 
authority, are anti-intraceptive, superstitious, and 
stereotypic in their thinking, are preoccupied with 
the “dominance-submission, strong-weak, leader-follower 
dimension,” overemphasize the conventionalized 
attributes of the ego, have exaggerated 
assertions of strength and toughness, are cynical 
and destructive, tend to believe that “wild and 
dangerous things go on in the world,” and have 
“exaggerated concern with sexual ‘goings on’” (10, 
p. 7). In addition, authoritarian men are overly 
masculine and women overly feminine (10, p. 12). 
Elsewhere, in a discussion of the “subsyndromes” 
to be found among highly ethnocentric individuals, 
the authoritarian is characterized as achieving: 


social adjustment only by taking pleasure in obedience 
and subordination. This brings into play the 
sadomasochistic impulse structure both as a condition 
and as a result of social adjustment. . . . In the 
psychodynamics of the “authoritarian character,” part of the 
preceding aggressiveness is absorbed and turned into 
masochism, while another part is left over as sadism 
. . . . Ambivalence is all-pervasive, being evidenced 
mainly by the simultaneity of blind belief in authority 
and readiness to attack those who are deemed weak and 
who are socially acceptable as “victims.” . . . He 
develops deep “compulsive” character traits, partly by 
retrogression to the anal-sadistic phase of development. 
. . . [His religious belief is compulsive and punitive; he 
has overt rigidity of conscience with] strong traces of 
ambivalence. . . . The over-rigid superego is not really 
integrated, but remains external (1, pp. 759-760). 


There is not quite as much said about the au- 
thoritarian’s opposite number, the equalitarian, 
but it is evident at once that he is a fairly well-ad- 
justed individual, albeit a trifle dull. He has no 
interesting psychiatric syndromes and is, in gen- 
eral, a more amiable, social creature than the 
authoritarian. According to F. H. Sanford, equali- 
tarians “seem inclined to take a rational and im- 
punitive position in a situation of social stress and 
they demonstrate a tendency to be relaxed and 
mild in the face of personal inconvenience at the 
hands of an inferior” (10, p. 47). In relating to 
authority figures, equalitarians “talk in terms of 
‘fairness’ and ‘kindness’ and ‘warmth,’” instead of 
using the “cold bromides” of the authoritarian. 


Equalitarians again demonstrate a relatively ra- 
tional and relaxed feeling for leaders [in contrast to the 
authoritarian who regards authority with ambivalence]. 
... In the face of strong authority, they are more in- 
clined to observe calmly what will happen than to fly off 
immediately into either acceptance or rejection. .. . In 
the face of strongly directive leadership they adopt a 
rational, group-centered position instead of coming 
down with petulance and maladaptive resentment 
(10, pp. 94-95). 

The “genuine liberal” is only one of five different 
syndromes to be found among individuals low in 
ethnocentrism, according to Adorno et al. (1). Their 
description of the “genuine liberal” is consistent 
with the characterization of the equalitarian given 
by F. H. Sanford: 


His ego is quite developed but not libidinized—he 
is rarely “narcissistic.” At the same time, he is willing 
to admit id tendencies, and to take the consequences— 
as is the case with Freud’s “erotic type.” One of his 
conspicuous features is moral courage, often far beyond 
his rational evaluation of a situation. He cannot “keep 
silent” if something wrong is being done, even if he 
seriously endangers himself. Just as he is strongly 
“individualized” himself, he sees the others, above all, 
as individuals, not as specimens of a general concept 
(1, p. 781). 

These descriptions are summarized in Table 1. 
It is readily apparent that the authoritarian comes 
out second best. If this characterization is accurate, 
then it is reasonable to expect statistically signifi- 
cant differences between authoritarians and 
equalitarians on a test of mental health. After all, 
the authoritarian is compulsive, punitive, rigid, 
analsadistic, etc., while the equalitarian is more 
flexible, erotic, rational, relaxed, etc. 

Four separate studies have been conducted 
which present data on precisely this problem. In 
each of these studies a different measure of authori- 
tarianism and a different criterion of mental health 
have been used. Yet in none of them has a statisti- 
cally significant relationship been found between 
authoritarianism and mental health. It may be easy 
to discount any one study which fails to confirm 
the experimental hypothesis on the grounds that 
the operational measures of the independent and 
dependent variables are inappropriate or weak. 
However, when the inference that authoritarians 
are more neurotic than equalitarians cannot be 
confirmed in any one of four studies using eight 
different measures of the dependent and indepen- 
dent variables, it begins to appear that the infer- 
ence and its underlying theory are in error. These 
four studies are presented below; only a brief ac- 
count of each is given, since more detailed descrip- 
tions are available elsewhere. 

1. Ethnocentrism in relation to the Minnesota 
Multiphasic Personality Inventory. Curiously 
enough, the very first study in this area was re- 
ported in the same volume that presented a neuro- 
tic picture of the authoritarian. In this study, 
scores on the California Ethnocentrism (E) Scale 
were compared with scores on the MMPI. The sub- 
jects were 34 men and 48 women, all of whom were 
patients in the Langley Porter Clinic in San Fran- 
cisco, an institution for the diagnosis and treatment 
of psychiatric disorders. “Comparisons of average 
scores on the various MMPI scales for the four 
E(thnocentrism) quartiles and preliminary inspec- 
tion of individual and group test-profiles failed to 
show large or consistent relationships between E 
and psychiatric syndromes as measured by this 
inventory” (1, p. 910). 

“As far as statistical significance of most of the 
results is concerned much is left to be desired. The 
scope of the investigation did not permit the use of 
many more than 120 subjects. For many of our 
comparisons this group had to be divided into 
many small subgroups. Taken one by one, most of 
the numerical results therefore are not statistically 
significant, nor otherwise impressive” (1, p. 961). 


TABLE 1 

COMPARISON OF PERSONALITY CHARACTERISTICS ASCRIBED TO 
AUTHORITARIAN AND EQUALITARIAN 

AUTHORITARIAN 
 1. Conventional (1, pp. 229-232; 10, p. 7) 
 2. Aggressive to unconventional individuals (1, pp. 232-234; 10, p. 7) 
 3. Anti-intraceptive (1, pp. 234-235; 10, p. 7) 
 4. Superstitious (1, pp. 235-236; 10, p. 7) 
 5. Stereotyped in thinking (10, p. 7) 
 6. Cynical (1, pp. 238-239; 10, p. 7) 
 7. Destructive (1, pp. 238-239; 10, p. 7) 
 8. Concern with sexual identifications (10, p. 121) 
 9. Sadomasochistic (1, p. 759) 
10. Ambivalent toward authority (1, p. 759; 10, p. 94) 
11. Compulsive (1, p. 759) 
12. Analsadistic (1, p. 759) 
13. Punitive (1, p. 409) 
14. Cold (10, p. 77) 
15. Directive (10, p. 79) 
16. Maladaptive (10, p. 95) 

EQUALITARIAN 
    Strongly individualized (1, p. 781) 
    Relaxed and mild (10, p. 47) 
    Readiness toward intraception (1, p. 466) 
    Scientific-naturalistic attitude (1, p. 464) 
    Moral courage (1, p. 781) 
    Willing to admit id tendencies and take the consequences (1, p. 781) 
10. Rational and relaxed feelings for leaders (10, p. 94) 
11. Flexible (1, p. 463) 
12. Erotic (1, p. 781) 
13. Impunitive (10, p. 43) 
14. Warm (10, p. 77) 
15. Nondirective, group-centered (10, p. 168) 
16. Rational (10, p. 95) 

2. Authoritarianism and Personal Security Scale 
scores.³ In 1949, 963 residents of Philadelphia, selected 
from 24 random census tract areas, were 
interviewed in a study of the leader-follower relationship. 
In the course of this interview the subjects 
were administered the Authoritarianism-Equalitarianism 
(A-E) Scale, and the Personal Security 
(P.S.) Scale. Both these scales have demonstrated 
that they can be used validly for specific 
purposes, the A-E scale in predicting responses on 
questions regarding leadership (4, 10) and the P.S. 
scale in successfully differentiating between hospitalized 
psychoneurotics and interns (6). 

3 This and the following two studies are part of a 
larger series of investigations of leadership being conducted 
by the Institute for Research in Human Relations. 
This research is sponsored by the Office of Naval 
Research (Contract No. N8onr-69401). Mr. F. Loyal 
Greer, of the Institute staff, was responsible for the 
analysis of the data concerning personal security scale 
scores and authoritarianism. 

A Pearson correlation coefficient computed between 
the scores on these two scales was -.03, with 
a standard error of .071. A t value for the difference 
in mean scores was .48. It is evident that the relationship 
between personal security and A-E is not 
statistically significant (4, p. 103). 

3. Relationship between scores on the California 
F test and the Rotter Incomplete Sentences Test.⁴ 
Sixty-four students attending summer school 
classes in introductory history and educational 
psychology at Temple University were adminis- 
tered the California F(ascism) test (forms 45 and 
40) and the Rotter Incomplete Sentences Test (8). 
This particular version of the sentence completion 
method has been found useful by several investi- 
gators (2, 7, 9); the F test is widely used and is 
described by Adorno et al. (1). 

A Pearson correlation coefficient computed between 
these two measures was not statistically 
significant (r = -.03). To maximize whatever differences 
there might have been, the 14 most equalitarian 
and the 14 most authoritarian subjects were 
selected, and a comparison of their incomplete 
sentences scores made. This t test showed that the 
differences were not significant. Bearing in mind 
that the authoritarian is theoretically ambivalent 
toward parents and authority figures, the scores 
made on the sentence stems “back home,” “a 
mother,” and “my father” were averaged and related 
to the F-test score. The Pearson correlation 
coefficient again was not significant (r = -.04) 
(4, p. 138). 
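
The size of these correlations can be put in perspective with the usual t transformation of r. The sketch below assumes that all 64 students contributed to both coefficients, which the text implies but does not state; the function name is illustrative.

```python
from math import sqrt

def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

N_STUDENTS = 64   # assumed sample size for both correlations

t_total = t_from_r(-0.03, N_STUDENTS)   # F test vs. total sentence-completion score
t_stems = t_from_r(-0.04, N_STUDENTS)   # F test vs. the three parent-related stems

print(round(t_total, 2), round(t_stems, 2))
```

Both values are far smaller in absolute size than the two-tailed .05 critical value of about 2.0 for 62 df, consistent with the report that neither correlation was significant.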

4. A comparison of the authoritarianism of 
nonhospitalized recruits and recruits under observation 
in a recruit special evaluation hospital. The 
measure of authoritarianism used in this study, 
called the Authority Acceptance Scale (AAS), was 
constructed to aid prediction about the recruit’s 
adjustment to a military situation (5). The meas- 
ure of mental health used was confinement, or lack 
of it, in a Navy hospital used for the observation 
of those recruits who have been unable to adjust 
to the Navy situation. The hospitalized recruits are 
described in an official Navy publication as follows: 
“The prevalent pattern of these recruits seems to 
include pronounced evidence of immature behav- 
ior, feelings of dependency in conflict with hostile 
actions against the environment and sufficient display 
of emotional instability to produce maladjustment” 
(3). 

The subjects of this study consisted of the entire 
population (N = 49) of recruit patients in the 
Bainbridge, Maryland, Naval Training Center 
Special Evaluation Unit Hospital.⁵ Their scores on 
the AAS were compared with the scores of 1,000 
nonhospitalized recruits. A t test computed for the 
difference between these means equaled 1.89, a 
figure which does not reach statistical significance. 

4 Drs. Clarence Smeltzer, Richard Martin, and Harry 
Woehr of Temple University were instrumental in 
providing Ss for this study. 

5 The author is indebted to Lt. W. B. Lyon of the 
Special Evaluation Hospital at Bainbridge who was 
responsible for the administration of the AAS. 
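
With 49 hospitalized and 1,000 nonhospitalized recruits, the t distribution is nearly normal, so the probability attached to a t of 1.89 can be approximated from the normal curve. The sketch assumes a two-tailed test, which the text does not specify; the function name is illustrative.

```python
from math import erfc, sqrt

def two_sided_p_normal(t):
    """Two-tailed probability for a deviate t under the normal approximation."""
    return erfc(abs(t) / sqrt(2))

p_value = two_sided_p_normal(1.89)
print(f"two-tailed p = {p_value:.3f}")
```

On that assumption the probability lies just above .05, which is why the difference is reported as falling short of significance.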

In the light of these data it seems evident that 
the characterization of the authoritarian has been 
overdrawn. All things evil have been posited in one 
end of the distribution; all things healthy and 
democratic have been attributed to the other. It 
has been natural to slip from a rigorous, scientific, 
“show-me” frame of reference, to one which has 
been heavily influenced by the researcher’s concern 
about world conditions. The concept of the author- 
itarian personality may be a valuable, heuristic 
tool, but only if it can be divorced as much as pos- 
sible from value judgments. There seems to have 
been a tendency to use the term “authoritarian” as 
a mild profanity which one could use to describe 
other people (never oneself). It is suggested that 
the concept be re-examined in the light of the data, 
purified of value judgments, and put to use again 
as a research instrument. 
THE LIFE AND WORK OF SIGMUND FREUD. Volume 
I. THE FORMATIVE YEARS AND THE GREAT 
DISCOVERIES, 1856-1900. By Ernest Jones. 
New York: Basic Books, 1953. Pp. xiv + 
428. Price $6.75. 

The first volume of the trilogy Dr. Jones prom- 
ises is a book of unparalleled interest and impor- 
tance for psychologists of all schools and theoretical 
persuasions. It presents an absorbing story which 
will never be more fully nor better told. The his- 
torical importance of Freud and his ideas hardly 
needs to be labored, and it is perhaps enough to 
say that this book is, in the reviewer’s opinion, the 
best available introduction to an understanding of 
the man and of psychoanalysis as he developed it. 
For it presents the work as well as the life of Freud, 
and carefully traces the development of psychoanalytic 
ideas up to their first great climax in The 
Interpretation of Dreams. 

The organization of this first volume, and pre- 
sumably of the whole work, is basically chronologi- 
cal. At the same time, however, important topics 
are treated as a whole within the general frame- 
work, so that all the facts about episodes in 
Freud’s life, such as his extraordinary friendship 
with Fliess, are told together. The result is a satis- 
factory compromise, although there is inevitably 
some overlap and repetition, and the reader 
does not get an integrated picture of all that was 
going on in Freud’s complicated life stream during 
certain crucial periods of his mature years. 

We begin with an interesting chapter on Origins, 
which tells a little about Freud’s family and— 
surprisingly—quite a lot about his early childhood. 
This chapter becomes even more meaningful after 
one has read the rest of the book. The story of 
Freud’s school years is less meaty, but many interesting 
facts begin to emerge when his higher education 
is discussed. Hardly “progressive” by many 
modern standards, Freud’s university let him go 
through medical school at his own pace, give him- 
self a liberal education as he went along, and carry 
out original anatomical research, the first report of 
which was published before he was 21! True, he 
was almost 25 when he got his M.D., but he had 
been involved in half a dozen pieces of research, 
several of them published, had translated a volume 
of J. S. Mill, and had had a year of compulsory 
military service. Not bad for one’s first quarter 
century, at least by present standards. 

It is tempting at this point to go into a more 
extended description of Freud’s personality as it 
emerges in Jones’s account, but space does not 
allow it. The fascination of this case goes far be- 
yond the interest inherent in any thoroughly 
studied life, and beyond what is implied in the 
fact that we can at last turn the tables on the 
great analyzer of others. Even if he had been an 
unknown, we should have been intrigued by the 
human interest of his story. But the effect is all 
the greater because the grave and dignified figure 
of the later photographs is suddenly revealed to 
have been an intensely passionate, vivid, fallible, 
and human person, torn with inner conflicts and 
neurotic troubles of his own. Titan though he was 
intellectually, he made his historic discoveries with 
pain, self-doubts, and tremendous labor. The 
achievement seems the more remarkable when we 
learn it was done against a background of suffer- 
ing from illnesses that would have incapacitated 
a man without his tremendous self-control—neuritis, 
sinusitis, migraine, “neurasthenia,” and others. 

The fact that this biography has made the best- 
seller lists should not be surprising, but may lead to 
some misconceptions. Jones has not abated a whit 
of his customary dignity and urbanity in writing 
this book. He has sought no sensationalism; most 
of his writing is clear, occasionally it is witty and 
graceful, and it makes effortless reading in the 
narrative sections. But he has made no effort to 
spare the reader. When it comes to theoretical or 
technical issues, Jones certainly makes good his 
expressed intention to write for his psychoanalytic 
colleagues. The great merit of the book lies in its 
material—painstakingly gathered, worked over 
with loving care, presented readably—rather than 
in what Jones does with it. How often does it hap- 
pen that an intimate history can be written about 
one of the great movers and shakers of the civi- 
lized world, so that we can follow day by day the 
development both of a tempestuous romantic love 
affair and of the painful wresting of basic discover- 
ies from that most stubborn of nature’s recesses, 
the human mind? 

It is particularly ironic that the materials exist 
that enable Jones to tell us this story about a man 
who had such a strong sense of privacy, and who 
took pains to prevent any future public knowledge 
of his private life even when he was thirsting for 
fame. Yet even though he several times destroyed 
personal records and diaries, he was betrayed by 
his own magnificent articulateness and by his need 
to express himself to someone from whom he got 
emotional support. Hence the two series of letters, 
to his fiancée and to his friend Fliess, which have 
fortunately been preserved. Good use has been 
made of them, but the reader of this book will not 
only look forward with impatience to the forth- 
coming translation of the Fliess correspondence, 
but will clamor for the publication of the love 
letters. May I add a speculation that it is not 
wholly accidental that Freud did express himself 
so fully and openly to people who were not likely 
to destroy his letters? It seems not too much to 
hazard a guess that where so much need for privacy 
is coupled with a longing for fame as great as 
Freud’s, there exists some unconscious exhibition- 
ism that may find obscure expression. 
Yet to Dr. Jones such speculations apparently
would seem to go too far. The book is remarkable 
for the lack of psychoanalyzing in it. Its author has 
limited himself primarily to a painstaking recon- 
struction of the facts; indeed, sometimes he gets a 
little lost in a scholarly enthusiasm for exact dating 
or placing or for unearthing facts that are not given 
much significance. On the whole, however, most 
psychological readers will find this approach 
preferable to one of overselectiveness and luxuriant 
interpretation. The facts receive a modicum of 
basic, rather conventional psychoanalytic interpre- 
tation and are at times related to specific character 
traits, but without any attempt to integrate them 
into an intelligible portrait. 

There is some evidence that Jones deliberately 
eschewed interpretation at a number of points. For 
example, on page 85, he raises the question of what 
the basis of Freud’s self-punishing trends was, but 
says: “The answer to this question we must leave 
to the psychoanalysts.” In large part, then, his
intention may have been to furnish all the facts 
to his colleagues, knowing that they would make 
their own interpretations anyway! Readers who 
can take such unexplained terms as “empyema of 
the antrums” in stride can certainly be expected
to be able to supply the hypothesis that Freud’s 
irrational antagonism toward Breuer after 1895 
had in it elements of “transference from earlier 
figures in his life—ultimately his father” (p. 308). 
Yet Jones does give us just this kind of psycho- 
analytic interpretation. 

Even more puzzling, the author’s interpreta- 
tions occasionally seem surprisingly superficial and 
unconvincing to anyone analytically informed. 
Perhaps the most amusing is Jones’s labored attempt
to explain why Freud, who was professionally
so interested in sex, nevertheless “displayed
less than the average personal interest in
what is often an absorbing topic.” He suggests 
“that Freud’s interest in sexual activities, like that 
in the aphasic disturbances of speech, came from 
the fact that sexuality has so obviously both 
physical and mental components” (p. 272). Then, 
too, he uncritically accepts Freud’s own “interpretation”
of his alleged lack of interest in the therapeutic
aspects of medicine: “My innate sadistic
disposition was not a very strong one, so that I 
had no need to develop this one of its derivatives” 
(p. 28). This in spite of voluminous evidence, 
faithfully set forth, about fairly overt sadism in 
Freud’s childhood, unusually intense hatreds in 
his adult life, and countless indirect manifestations 
such as depressions, somatizations, etc. In general, 
Jones makes very little use of aggression as an ex- 
planatory principle, nor does he refer often to 
specific defenses other than repression. Such omis- 
sions may be due to the fact that psychoanalysis 
as he learned and practiced it for many years was 
essentially a single-instinct (libido) theory and 
an id psychology. Like Freud himself, many of the 
oldest generation of analysts never fully assimilated
the implications of the dual instinct theory nor of 
ego psychology into their thinking. 

Yet in many other contexts that do not directly 
involve hostility, Jones shies away from obvious
interpretations. I have not counted the references 
to fame that Freud made, quoted in this book, 
but there are scores of them, alternating between 
rather straightforward wish-fulfilling fantasies of 
world renown and frequent denials (technically, 
negations) that he had any interest in leaving his 
name carved on a rock for future generations. Yet 
Jones accepts the protestation, apparently without
questioning it or considering the possibility that the wish
was strong but the defense almost equally so. 

Jones’s exposition of Freud’s early ideas is clear 
and helpful, though not particularly critical. 
Throughout, he tries manfully to be unbiased, and 
does indeed make many statements that are any- 
thing but praising; on the whole, he achieves a 
surprising degree of objectivity, considering how 
hard it is to be objective about so controversial a 
subject. But I am reluctantly forced to conclude 
that the interpretative weaknesses mentioned de- 
rive from a limitation in Jones’s ability to conceive 
his hero in unflattering terms. 

A couple of other criticisms might be made. 
Though Jones does a good job of picturing the 
intellectual tradition and context out of which 
Freud’s ideas came, the book is weak on the social 
and cultural influences on his life and thought. 
For example, he does not fully convey to the reader 
the extraordinary prestige that university appoint- 
ments, particularly professorships, enjoyed in 
Vienna. Nor does he develop much the part that 
Freud’s Jewish cultural heritage played in his life 
and thought. 

A final minor point: psychoanalytic technique, 
as Jones consistently depicts it, seems extremely 
passive, nonmanipulative, anything but directive 
or interventive. Certainly the new method that 
Freud developed was all of this, in contrast to con- 
ventional and prevailing psychiatric treatments, 
but today the emphasis gives a distorted impression.
For example, Jones hardly mentions interpretation
in his description of psychoanalytic therapy.

Some of this criticism may be premature. After 
all, two more volumes are to appear, and in them 
Jones may plan to do much that he has left un- 
done in this first volume. It stands alone to such a 
great extent, with a real dramatic unity, that one 
tends to forget that most of the work has not yet 
been done. At any rate, a great wealth of fact 
about Freud’s first 44 years has been made avail- 
able. If Jones plans to end the project by a sum- 
mary portrait, it would give us further reason to 
look forward even more eagerly to volumes 2 and 3. 

Robert R. Holt
Research Center for Mental Health, 
New York University. 